Abstract

We consider the fundamental task of representing a boolean function f by
a conjunctive normal form (clause-set) F for the purpose of SAT solving. The
boolean function f here acts as a kind of constraint, like a cardinality
constraint or an S-box in a cryptosystem, while F is a subset of the whole SAT
problem to be solved. The traditional approach towards "good" properties of
F considers "arc consistency", which demands that for every partial
instantiation of f, all forced assignments can be recovered from the corresponding
partial assignment to F via unit-clause propagation (UCP). We propose to
consider a more refined framework: First, instead of considering the above
relative condition, a relation between f and F, we consider an absolute
condition, namely that goodness of F is guaranteed by F being an element of a suitable
target class. And second, instead of just considering UCP, we consider
hierarchies of target classes, which allow for different mechanisms than UCP and
allow for size/complexity trade-offs.

The hierarchy $\mathcal{UC}_k$ of unit-refutation complete clause-sets of level k,
introduced in [?], provides the most basic target classes, that is, $F \in \mathcal{UC}_k$
is to be achieved for k as small as feasible. Here $\mathcal{UC}_1 = \mathcal{UC}$ has been
introduced in [?] for the purpose of knowledge compilation. In general, $\mathcal{UC}_k$
is the set of clause-sets F such that unsatisfiable instantiations (by partial
assignments) are recognisable by k-times nested unit-clause propagation. We
also touch upon the hierarchy $\mathcal{PC}_k$ of propagation complete clause-sets of level
k, where $\mathcal{PC}_1 = \mathcal{PC}$ has been introduced in [?]. The hierarchy $\mathcal{PC}_k$ refines
the hierarchy $\mathcal{UC}_k$ by providing intermediate layers. In order to make use of
full resolution, we consider the hierarchy $\mathcal{WC}_k$ of width-refutation complete
clause-sets of level k, employing an improved notion of width (so that we
always have $\mathcal{UC}_k \subseteq \mathcal{WC}_k$).

Via the absolute condition, the quality of the representation F is fully
captured by the target class, and the only relation between f and F is that F
must "represent" f. If F does not contain new variables, then this means that
F is equivalent to f, while with new variables the satisfying assignments of F
projected to the variables of f must be precisely the satisfying assignments of
f. Without new variables, the relative and absolute condition coincide, but
with new variables, the absolute condition is stronger. As we remark in this
article, for the relative condition and new variables at least the hierarchies $\mathcal{UC}_k$
and $\mathcal{PC}_k$ collapse, and we also conjecture that the $\mathcal{WC}_k$ hierarchy collapses.
The main result of this article is that without new variables, none of these
hierarchies collapses. That means that there are boolean functions with only
exponential-size equivalent clause-sets at level k, but with poly-size equivalent
clause-sets at level $k+1$.



Representations with new variables in general allow shorter representa- 
tions. However representations without new variables can be systematically 
searched for, opening a new algorithmic avenue for good SAT representa- 
tions, where in a pre-processing phase the representation is being optimised. 
When using a two-stage approach, then first non-algorithmically a represen- 
tation with new variables can be constructed, which then can be optimised 
by searching for an equivalent better clause-set. 

We believe that many common CNF representations either already fit into
the $\mathcal{UC}_k$ scheme or can be made to fit by slight improvements. We give some basic
tools to construct representations in $\mathcal{UC}_1$, now with new variables and based
on the Tseitin translation. We conclude with a discussion of open problems
and future directions, with special emphasis on separations for the various
hierarchies involved.
1 Introduction



It has been shown that the practical performance of SAT solvers can depend heavily
on the SAT representation used. See for example [?] for work on cardinality
constraints, [?] for work on general constraint translations, and [?] for
investigations into different translations in cryptography. In order to obtain "good"
representations, until now the emphasis has been on translating constraints into
SAT such that "arc-consistency" is "maintained", via unit-clause propagation; for
an introduction into the literature see Section 22.6.7 in [?], while various
case-studies can be found in [?]. That is, for all (partial) assignments
to the variables of the constraint, the task is to ensure that if there is a forced
assignment (i.e., some variable which must be set to a particular value to avoid
inconsistency), then unit-clause propagation (UCP) is sufficient to find and set
this assignment. In a similar vein, there is the class $\mathcal{PC}$ of propagation-complete
clause-sets (see [?]), containing all clause-sets for which unit-clause propagation is
sufficient to detect all forced assignments.

Maintaining arc-consistency and propagation-completeness may at a glance seem 
the same concept. However there is an essential difference. When translating a 
constraint into SAT, typically one does not just use the variables of the constraint, 
but one adds auxiliary variables to allow for a compact representation. Now when 
one speaks of maintaining arc-consistency, one only cares about assignments to 
the constraint variables. However propagation-completeness deals only with the 
representing clause-set, thus can not know about the distinction between original 
and auxiliary variables, and thus it is a property on the (partial) assignments over 
all variables! A SAT representation, which maintains arc-consistency via UCP, will 
in general not be propagation-complete, due to assignments over both constraint 
and new variables yielding a forced assignment or even an inconsistency which 
UCP doesn't detect; see Example 6.9 and Lemma 6.10. In contrast to this, for the
basic concepts of "good" representations investigated in this paper, considering all 
variables is a fundamental feature. 

In [?] it is shown that conflict-driven solvers with branching restricted to input
variables have only superpolynomial run-time on EPHP, an Extended Resolution
extension to the pigeon-hole formulas, while unrestricted branching determines
unsatisfiability quickly (see Subsection 8.3 for more on this). Also experimentally it is
demonstrated in [38] that input-restricted branching can have a detrimental effect
on solver times and proof sizes for modern CDCL solvers. This adds motivation
to our fundamental choice of considering all variables (rather than just input
variables), when deciding what properties we want for SAT translations. We call this
the "absolute (representation) condition", taking also the auxiliary variables into
account, while the "relative condition" only considers the original variables. Besides
avoiding the creation of hard unsatisfiable sub-problems, the absolute condition also
enables one to study the target classes, like $\mathcal{PC}$, on their own, without relation to
what is represented.

In a certain sense, the underlying idea of maintaining arc-consistency and
propagation-complete translations is to compress all of the constraint knowledge into the
SAT translation, and then to use UCP to extract this knowledge when appropriate.
Motivated by the absolute condition, in [?] we considered the somewhat
more fundamental class $\mathcal{UC}$ of unit-refutation complete clause-sets, introduced in [?] as a
method for propositional knowledge compilation, and studied its properties. Rather
than requiring that UCP detects all forced assignments (as for $\mathcal{PC}$), a clause-set is
in $\mathcal{UC}$ iff for all partial assignments resulting in an unsatisfiable clause-set UCP
detects this.
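The membership condition just described can be checked by brute force on very small clause-sets. The following is a minimal sketch only, under assumptions that are not part of the article: literals are encoded as non-zero integers (v and -v), clauses as frozensets, partial assignments as dictionaries from variables to booleans, and the function names `apply`, `ucp`, `satisfiable` and `in_UC` are hypothetical. It enumerates all partial assignments and is therefore exponential.

```python
from itertools import product

EMPTY = frozenset()  # the empty clause


def apply(phi, F):
    """Apply a partial assignment: drop satisfied clauses, remove falsified literals."""
    result = set()
    for C in F:
        if any(abs(x) in phi and phi[abs(x)] == (x > 0) for x in C):
            continue  # clause contains a literal set to true
        result.add(frozenset(x for x in C if abs(x) not in phi))
    return result


def ucp(F):
    """Unit-clause propagation; returns {EMPTY} iff a contradiction is found."""
    F = set(F)
    while True:
        if EMPTY in F:
            return {EMPTY}
        unit = next((C for C in F if len(C) == 1), None)
        if unit is None:
            return F
        (x,) = unit
        F = apply({abs(x): x > 0}, F)


def satisfiable(F):
    """Brute-force satisfiability test over all total assignments."""
    vs = sorted({abs(x) for C in F for x in C})
    return any(apply(dict(zip(vs, bits)), F) == set()
               for bits in product((False, True), repeat=len(vs)))


def in_UC(F):
    """True iff UCP refutes every unsatisfiable instantiation of F."""
    vs = sorted({abs(x) for C in F for x in C})
    for vals in product((None, False, True), repeat=len(vs)):
        phi = {v: b for v, b in zip(vs, vals) if b is not None}
        G = apply(phi, F)
        if not satisfiable(G) and ucp(G) != {EMPTY}:
            return False
    return True


horn = {frozenset({-1, 2}), frozenset({-2, 3}), frozenset({1})}
all_binary = {frozenset({1, 2}), frozenset({-1, 2}),
              frozenset({1, -2}), frozenset({-1, -2})}
```

Here `horn` is a Horn clause-set and passes the check, while `all_binary` (the unsatisfiable 2-CNF with all four binary clauses over two variables) fails it, since it contains no unit clause for UCP to start from.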

So we have $\mathcal{UC}$ and $\mathcal{PC}$ as potential target classes for "good" SAT
representations. In both cases we know that if the SAT solver ends up in an unsatisfiable part
of the search space, then the ubiquitous unit-clause propagation will immediately
determine this and the solver will avoid potentially exponential work. However, how
easy is it to come up with representations in $\mathcal{UC}$? It is easy to come up with
examples of "good" clause-sets which are not in $\mathcal{UC}$, e.g., 2-CNF. Given that UCP is a
relatively simple mechanism, perhaps it would be better to consider more powerful
inference methods allowing for a greater variety and possibly shorter representations
("more compression")? To this end, to add more power, we introduce "hardness
measurement".



1.1 A general framework: hierarchies and measurement 

In [?], using generalised unit-clause propagation $r_k$ (with $r_1$ being UCP)
introduced in [42, 45], we developed a hierarchy $\mathcal{UC}_k$ (with $\mathcal{UC}_1 = \mathcal{UC}$) of clause-sets
of "hardness" at most k, that is, refutation is (always) possible via $r_k$. Replacing $r_1$
with $r_k$ in the same way in $\mathcal{PC}$ yields the propagation-completeness hierarchy $\mathcal{PC}_k$
(with $\mathcal{PC}_1 = \mathcal{PC}$). In the limit these hierarchies cover all clause-sets, with the levels
of the hierarchy offering the possibility to trade complexity of the inference method
($r_k$) for size of the representation. Generalising existing results we show in Lemma
6.5 of [?] that various poly-time solvable SAT classes are contained within
levels of the $\mathcal{UC}_k$ hierarchy. That is, $\mathcal{HO} \subset \mathcal{RHO} \subset \mathcal{UC}_1$ (Horn and renamable
Horn clause-sets), $2\text{-}\mathcal{CLS} \subset \mathcal{QHO} \subset \mathcal{UC}_2$ (2-clause-sets and q-Horn clause-sets,
see Section 6.10.2 in [?] and [?]) and $\mathcal{HO}_k \subset \mathcal{UC}_k$ (generalised Horn clause-sets).
There are strong proof-theoretic connections for $\mathcal{UC}_k$ to tree-resolution. In [?]
the argument is made that tree-resolution complexity can not provide a good
measure of hardness of instances for SAT solving, citing the ability of CDCL solvers
to simulate exponentially more powerful full resolution (see [?] for evidence that
CDCL solvers can simulate full resolution). However, the aim of $\mathcal{UC}_k$ is not to
measure hardness, but to offer a target class for SAT translation. In this respect
tree-resolution complexity measures are ideal, because they provide the strongest
translations, and upper-bound measures for full resolution.

On the other hand, for tighter target classes in the case of full resolution, we
also consider the notion of width-hardness as introduced in [?], based on the
width-based hierarchies of unsatisfiable clause-sets in [?]. That is, a clause-set
is in $\mathcal{WC}_k$, the hierarchy of clause-sets of width-hardness k, iff under any partial
assignment resulting in an unsatisfiable clause-set there is a "k-resolution" refutation
as introduced in [?]. Here, unlike the typical notion, we allow resolutions where
only one parent clause needs to have length at most k, thus properly
generalising unit-resolution (one could speak of "asymmetric width" here, compared to the
standard "symmetric width"). This allows one to simulate nested input resolution, and
thus we have $\mathcal{UC}_k \subseteq \mathcal{WC}_k$ for all k, whereas otherwise in the standard (symmetrical)
sense even Horn clause-sets require unbounded width (recall that $\mathcal{HO} \subset \mathcal{UC}_1$).

Fundamental for each hierarchy is an underlying measure $h_0 : \mathcal{USAT} \to \mathbb{N}_0$,
measuring the "hardness" of unsatisfiable clause-sets, which is extended to
$h : \mathcal{CLS} \to \mathbb{N}_0$, where $h(F)$ for an arbitrary clause-set F measures the "hardness" to
derive any conclusion $F \models C$ for clauses C, by letting $h(F)$ be the maximum of
$h_0(\varphi * F)$ over all partial assignments $\varphi$ such that application yields an unsatisfiable
result $\varphi * F$. The hierarchy at level k collects all F with $h(F) \le k$. For the PC-UC
hierarchy the corresponding measure phd(F) resp. hd(F) can be described in
many ways; most intuitive from a SAT point of view is to say that it measures the
necessary nesting level of UCP, that is, which $r_k$ is required.
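To make the "nesting level of UCP" concrete, here is a small brute-force sketch. It assumes the usual recursive definition of generalised unit-clause propagation from the cited literature (this definition is not spelled out in the present section): $r_0$ only detects the empty clause, and $r_k$ sets a literal x to true whenever $r_{k-1}$ refutes the clause-set with x set to false. The integer encoding of literals and the names `apply`, `r` and `hardness` are illustrative assumptions, and the enumeration of all partial assignments is exponential.

```python
from itertools import product

EMPTY = frozenset()  # the empty clause


def apply(phi, F):
    """Apply a partial assignment: drop satisfied clauses, remove falsified literals."""
    result = set()
    for C in F:
        if any(abs(x) in phi and phi[abs(x)] == (x > 0) for x in C):
            continue
        result.add(frozenset(x for x in C if abs(x) not in phi))
    return result


def r(k, F):
    """Generalised unit-clause propagation r_k (assumed standard definition)."""
    F = set(F)
    if EMPTY in F:
        return {EMPTY}
    if k == 0:
        return F
    for v in sorted({abs(x) for C in F for x in C}):
        for x in (v, -v):
            # If setting literal x to false is refuted at level k-1, x is forced true.
            if r(k - 1, apply({v: x < 0}, F)) == {EMPTY}:
                return r(k, apply({v: x > 0}, F))
    return F


def hardness(F):
    """hd(F): max over unsatisfiable instantiations of the least k with r_k refuting."""
    vs = sorted({abs(x) for C in F for x in C})
    hd = 0
    for vals in product((None, False, True), repeat=len(vs)):
        phi = {v: b for v, b in zip(vs, vals) if b is not None}
        G = apply(phi, F)
        for k in range(len(vs) + 1):  # r_n is complete for n variables
            if r(k, G) == {EMPTY}:
                hd = max(hd, k)
                break  # satisfiable instantiations never reach this branch
    return hd


units = {frozenset({1}), frozenset({2})}
horn = {frozenset({-1, 2}), frozenset({-2, 3}), frozenset({1})}
all_binary = {frozenset({1, 2}), frozenset({-1, 2}),
              frozenset({1, -2}), frozenset({-1, -2})}
```

In this sketch `units` has hardness 0, `horn` has hardness 1 and `all_binary` has hardness 2, so they sit at levels 0, 1 and 2 of the hierarchy respectively.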
1.2 Representation of boolean functions



By definition each of $\mathcal{PC}_k$, $\mathcal{UC}_k$, $\mathcal{WC}_k$ is just a class of clause-sets. However when using
these classes for representing boolean functions, then there are further aspects. In
general, for translations to SAT a typical path is

$$\text{Problem} \longrightarrow \text{Constraints} \longrightarrow \underbrace{\text{Boolean functions} \longrightarrow \text{SAT}}_{\text{our focus}}$$

By considering target classes for "good" SAT representations, we focus our atten- 
tion on the final stage, the translation of boolean functions to SAT, ignoring the 
issue of encoding non-boolean domains into the boolean. Now there are three main 
dimensions to consider (choices to make): 

1. Inference properties ($\mathcal{PC}_k$ versus $\mathcal{UC}_k$ versus $\mathcal{WC}_k$, and the value of k): How
strong a property we require of the clause-set we translate to ($\mathcal{PC}_k$ is strongest,
$\mathcal{WC}_k$ weakest, and the lower k the stronger the condition).

2. Logical equivalence versus new variables: whether the SAT translation is 
equivalent to the input function (i.e., no new variables), or uses new variables 
to extend the original function. 

3. Relative versus absolute condition: in case new variables are used, whether 
the property we require for the translation refers to partial assignments only 
on the original variables or also on the new variables. 



See Subsection 6.1 for more on the relative condition; our point of view is that the 
absolute condition is fundamental for the representation of boolean functions, not 
the relative condition (which has been dominant in the literature until now). 

In the area of Knowledge Compilation, the task is also to represent ("compile")
boolean functions to allow good inference under (repeated) queries. In particular,
one wants to find a representation for a boolean function which allows queries such
as clausal entailment ($F \models C$), equivalence, and model counting to be answered
efficiently (in polynomial time). In this sense, we can think of "finding a good
representation" as a form of SAT knowledge compilation, where we care (only)
about clausal entailment, since CNF-clauses directly correspond to falsifying partial
assignments. [?] gives an overview of the CNF-based target languages (prime
implicates, $\mathcal{UC}$, $2\text{-}\mathcal{CLS}$, Horn clause-sets). [?] consider disjunctions of simple CNF
classes. [?] provides an overview of target compilation languages based on "nested"
(graph-based) classes, namely variants of NNF, DNNF and BDDs. In all cases query
complexity and succinctness are investigated. We focus on CNF representations, since
we want good representations for current resolution-based SAT solvers. All of the
CNF classes studied in [?] are included at the first three levels of the
hierarchy $\mathcal{UC}_k$, namely, sets of prime implicates in $\mathcal{UC}_0$, (renamable) Horn
clause-sets in $\mathcal{UC}_1 = \mathcal{UC}$, and $2\text{-}\mathcal{CLS}$ in $\mathcal{UC}_2$. Translations from target classes such as
DNNF to CNF are also of interest and fit into the framework of $\mathcal{UC}_k$ via using new
variables; see Section 6.2 for the most basic positive considerations. And see [?] for
a basic negative result, characterising what can be represented under the relative
condition (i.e., arc consistency).



1.3 Strictness of hierarchies 

A fundamental question is the strictness of these hierarchies $\mathcal{PC}_k$, $\mathcal{UC}_k$, and $\mathcal{WC}_k$
in each of those two remaining dimensions. That is, whether each level offers new
possibilities for polysize representations of (sequences of) boolean functions within
the confines of the specified dimensions, i.e., relative versus absolute and without
versus with new variables. Using the basic choice of the absolute condition, we
have six proper hierarchies (3 conjectured, 3 proven), namely $\mathcal{PC}_k$, $\mathcal{UC}_k$ and $\mathcal{WC}_k$
for representations without (Theorem 5.13) and with new variables (Conjecture
??).¹ However when using representations based on the relative condition (and
using new variables), then all these hierarchies collapse to their first level: two
collapses are similar to existing results, while the collapse for $\mathcal{WC}_k$ should follow
also in this way, and is spelled out as Conjecture 3.4.

Considered together, under the relative condition only the levels $\mathcal{PC}_0 \subset \mathcal{UC}_0 \subset
\mathcal{PC}_1$ are strict regarding polysize representations, where the two classes $\mathcal{PC}_0 \subset \mathcal{UC}_0$
do not gain anything from the new variables, while everything of $\mathcal{PC}_k$, $\mathcal{UC}_k$ and
$\mathcal{WC}_k$ for k > 1 can be reduced (in polytime, with exponent depending on k) to
$\mathcal{PC}_1 = \mathcal{PC}$. And $\mathcal{PC}$ under the relative condition is the same as the well-known
condition of "arc consistency" for SAT translation. The main result of this paper,
that $\mathcal{PC}_k$, $\mathcal{UC}_k$ and $\mathcal{WC}_k$ for the absolute condition and without new variables do
not collapse, shows that a rich structure was hidden under the carpet of the relative
condition aka arc consistency. A basic difference between relative and absolute
condition is that under the relative condition the new variables can be used to
perform certain "computations", since there are no conditions on the new variables
other than not to distort the satisfying assignments, and this is used to show the
collapse to arc consistency, by encoding the stronger condition into the clause-sets
in such a way that UCP can perform the "computations".

1.4 The UC hierarchy is strict regarding equivalence 

A sequence $(f_h)_{h \in \mathbb{N}}$ of boolean functions, which separates $\mathcal{UC}_{k+1}$ from $\mathcal{UC}_k$ w.r.t.
clause-sets equivalent to $f_h$ in $\mathcal{UC}_{k+1}$ resp. $\mathcal{UC}_k$, should have the following properties:

1. A large number of prime implicates: the number of prime implicates
for $f_h$ should at least grow super-polynomially in h, since otherwise already
the set of prime implicates is a small clause-set in $\mathcal{UC}_0$ (see Definition ??)
equivalent to $f_h$.

2. Easily characterised prime implicates: the prime implicates of $f_h$ should
be easily characterised, since otherwise we can not understand what clause-sets
equivalent to $f_h$ look like.

3. Poly-size representations: there must exist short clause-sets in $\mathcal{UC}_{k+1}$
equivalent to $f_h$ for all $h \in \mathbb{N}$.

[?] introduced a special type of boolean functions, called Non-repeating Unate
Decision trees (NUD) there, by adding new variables to each clause of clause-sets in
$\mathcal{SMU}_{\delta=1}$, which is the class of unsatisfiable hitting clause-sets of deficiency $\delta = 1$.
These boolean functions have a large number of prime implicates (the maximum
regarding the original number of clauses), and thus are natural to consider as
candidates to separate the levels of $\mathcal{UC}_k$. In Section ?? we show that there is actually
a general method behind it: we "dope" arbitrary clause-sets, and show that the
prime implicates of the doped clause-sets correspond to the "minimal premise sets"
contained in the original clause-sets. Minimal premise sets strengthen irredundant
clause-sets by requiring that a clause can be implied such that all clauses are needed.
In Section 5 we introduce the basic method (see Theorem 5.4) for lower bounding
the size of equivalent clause-sets of a given hardness, via the transversal number of
"trigger hypergraphs". Using this lower bound method, in Theorem 5.12 we show a
lower bound on the matching number of the trigger hypergraph of doped "extremal"
$\mathcal{SMU}_{\delta=1}$-clause-sets. From this follows immediately Theorem 5.13, that for every
$k \in \mathbb{N}_0$ there are polysize clause-sets in $\mathcal{UC}_{k+1}$, where every equivalent clause-set
in $\mathcal{WC}_k$ is of exponential size. Thus the $\mathcal{UC}_k$ as well as the $\mathcal{WC}_k$ hierarchy is strict
regarding equivalence of polysize clause-sets.

¹ Regarding $\mathcal{PC}_k$ we get only a separation of $\mathcal{PC}_k$ and $\mathcal{PC}_{k+2}$; this will be addressed in future
work.



1.5 Tools for good representations 

We conclude our investigations by considering translations based on the Tseitin
translation in Section 6, and show that interesting classes of boolean functions can
be polynomially translated to $\mathcal{UC}$ under the absolute condition using new variables.

First we discuss the notion of "representation" in general in Subsection 3.1, with
special emphasis on the "relative" versus the "absolute condition".

The Tseitin translation for DNFs we call "canonical translation", and we
investigate it in Subsection 6.2. In particular, in Lemma 6.11 we show that every
orthogonal (or "disjoint", or "hitting") DNF is translated to $\mathcal{UC}$, while in Lemma 6.16
we show that actually every DNF is translated to $\mathcal{UC}$, when using the "reduced"
canonical translation, which uses only the necessary part of the equivalences
constitutive for the Tseitin translation. Applied to our examples yielding the separation of
$\mathcal{UC}_{k+1}$ from $\mathcal{WC}_k$ (Theorem 5.13, regarding polysize representations without new
variables), we obtain a representation in $\mathcal{UC}$ in Theorem 3.13 (for the canonical
translation), demonstrating the power of using new variables.

It has been noted in the literature at several places (see [?]) that
one might use only one of the two directions of the equivalences in the
Tseitin translation. Regarding the canonical translation we have the full translation
(Definition 6.5) versus the reduced translation (Definition 3.14). The full translation
yields $\mathcal{UC}$ for special inputs (Lemma 6.11), and has relative hardness 1 for general
DNF (Lemma 6.8), however (absolute) hardness for arbitrary DNF-inputs can be
arbitrarily high as shown in Lemma 6.10. On the other hand, the reduced
translation always yields $\mathcal{UC}$ (Lemma 6.16). So we have the following explanations why
using either both directions or only one direction in the Tseitin translation, in the
context of translating DNFs, can perform better than the other form:

• When using both directions (i.e., the canonical translation), then splitting on
the auxiliary variables is powerful, which is an advantage over using only one
direction (i.e., the reduced canonical translation), where setting an auxiliary
variable to false says nothing.

• On the other hand, the canonical translation, when applied to non-hitting
DNFs, can create hard unsatisfiable sub-problems (via partial assignments),
which can not happen for the reduced translation.

It seems very interesting to us to turn these arguments into theorems (for concrete 
examples), and also to experimentally evaluate them. In this way we hope that 
in the future more precise directions can be given when to use which form of the 
Tseitin translation. 
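The difference between the two forms can be illustrated with a short sketch. It rests on assumptions, since the formal definitions (Definitions 6.5 and 3.14) are not part of this section: for each DNF term C a fresh auxiliary variable v is introduced; the full translation encodes the equivalence of v with the conjunction of C, while the reduced translation keeps only the direction "v implies every literal of C"; both add one clause stating that some auxiliary variable is true. The integer encoding of literals and the name `canonical_translation` are hypothetical.

```python
def canonical_translation(dnf, reduced=False):
    """Tseitin-style translation of a DNF (set of frozensets of int literals) to CNF.

    Assumed scheme: term number i gets the fresh variable n + i, where n is the
    largest variable index occurring in the DNF.
    """
    terms = sorted(dnf, key=sorted)  # fixed order, so auxiliary variables are reproducible
    n = max((abs(x) for C in terms for x in C), default=0)
    cnf, aux = set(), []
    for i, C in enumerate(terms, start=1):
        v = n + i
        aux.append(v)
        for x in C:
            cnf.add(frozenset({-v, x}))                # v implies each literal of C
        if not reduced:
            cnf.add(frozenset({v} | {-x for x in C}))  # all literals of C imply v
    cnf.add(frozenset(aux))                            # at least one term holds
    return cnf


dnf = {frozenset({1, 2}), frozenset({-1, 3})}
full = canonical_translation(dnf)
red = canonical_translation(dnf, reduced=True)
```

For this two-term DNF the full translation has seven clauses and the reduced one five. The two dropped clauses are exactly those that would force an auxiliary variable to true, which matches the remark above that in the reduced form setting an auxiliary variable to false says nothing.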



In Subsection 3.3 we turn to the translation of "XOR-clauses". Section 1.5 of
[?] discusses the translation of the so-called "Schaefer classes" into the $\mathcal{UC}_k$
hierarchy; see Section 12.2 in [?] for an introduction, and see [?] for an in-depth
overview on recent developments. All Schaefer classes except affine equations have
natural translations into either $\mathcal{UC}_1$ or $\mathcal{UC}_2$. The open question is whether systems
of XOR-clauses (i.e., affine equations) can be translated into $\mathcal{UC}_k$ for some fixed k.
We consider the most basic questions in a sense. On the positive side, for single
XOR clauses, we show in Lemma 6.18 that the Tseitin translation of a typical XOR
summation circuit is in $\mathcal{UC}$. On the negative side, in Theorem 7.5, we show for
all k > 3 that applying this translation piecewise to systems of just two
large-enough XOR clauses yields a SAT translation not in $\mathcal{UC}_k$. Conjecture 6.20 then
hypothesises that, in general, systems of XOR-clauses have no representation of
bounded hardness.



1.6 Remarks on the term "hardness" 

In general, if one speaks of the "hardness measure" $\mathrm{hd} : \mathcal{CLS} \to \mathbb{N}_0$
(Definition 2.5) in context with other measures, then one should call it more specifically
tree-hardness ("t-hardness"), denoted by thd(F), due to its close relation to
tree-resolution (and its space complexity). So we have three basic types of
hardness-measures, namely t-hardness thd(F), the minimum k with $F \in \mathcal{UC}_k$, p-hardness
phd(F), the minimum k with $F \in \mathcal{PC}_k$, and w-hardness, the minimum k with
$F \in \mathcal{WC}_k$. In this article, since thd(F) is still most important here, we denote it by
hd(F) = thd(F).

In what respect is the terminology "hardness" appropriate? The hardness
measure hd(F) has been introduced in [42, 45], based on quasi-automatisation of
tree-resolution, that is, on a specific algorithmic approach (close to Stålmarck's
approach).² In [?] hd(F) for unsatisfiable F was proposed as a measure of
SAT-solver-hardness in general. This was criticised in [?] by the argument that conflict-driven
SAT solvers would be closer to dag-resolution (full resolution) than tree-resolution.
Due to the heuristic nature of SAT solvers, it seems to us that there is no robust measure
of SAT-solver-hardness. Instead, our three basic measures, which are robust and
mathematically meaningful, measure how good a clause-set F is in representing an
underlying boolean function in the following sense:

• Regarding instantiation we take a worst-case approach, that is, we consider
all partial assignments $\varphi$ and their applications $\varphi * F$ (insofar as they create
unsatisfiability or forced literals).

A SAT-solver only uses certain partial assignments, and thus this worst-case
approach is overkill. However when using F in any context, then it makes
sense to consider all partial assignments.

• Regarding algorithms, we take a breadth-first approach, that is, the smallest
k such that $r_k$ or k-resolution succeeds. For k > 1 a SAT-solver might not
find these inferences. However those algorithms need polynomial time, and
thus are implementable, and furthermore the maximisation over all partial
assignments needs to be complemented with a minimisation in order to yield
something interesting.

1.7 Overview on results 

Main results on minimal premise sets and doping: 



1. Theorem 3.18 shows the correlation between prime implicates of doped
clause-sets and minimal premise-sets of the original (undoped) clause-sets.

2. Theorem 4.11 characterises unsatisfiable clause-sets where every non-empty
sub-clause-set is a minimal premise set.

3. Theorem 4.18 gives basic characteristics of doped $\mathcal{SMU}_{\delta=1}$-clause-sets.



Main results on lower bounds for the hardness of all equivalent clause-sets:

1. Theorem 5.4 introduces the basic method for lower bounding the size of
equivalent clause-sets of a given hardness, via the transversal number of "trigger
hypergraphs".

2. Theorem 5.12 shows a lower bound on the matching number of the trigger
hypergraph of doped "extremal" $\mathcal{SMU}_{\delta=1}$-clause-sets.

3. Theorem 5.13 shows that for every $k \in \mathbb{N}_0$ there are polysize clause-sets in
$\mathcal{UC}_{k+1}$, where every equivalent clause-set in $\mathcal{WC}_k$ is of exponential size.

² Using the simplest oracle, on unsatisfiable instances the measure from [?] yields hd(F).
But on satisfiable instances the approach of [42], [45] is very different, namely an algorithmic
polynomial-time approach is taken, extending the breadth-first search for tree-resolution
refutations in a natural way.

Regarding upper bounds, that is, short representations (with new variables) 
with low hardness, we have the following: 



1. Lemmas 6.11, 6.16 show how the canonical translation can yield results in $\mathcal{UC}$.

2. Theorem 3.13 shows that all doped $\mathcal{SMU}_{\delta=1}$-clause-sets (and in fact all doped
unsatisfiable hitting clause-sets) have short CNF-representations in $\mathcal{UC}$ via the
canonical translation.

3. Lemma 6.18 shows that translating a single XOR-clause to $\mathcal{UC}$ is easy, while
Theorem 7.5 shows that applying this translation to just two XOR-clauses
already yields high hardness.

Finally, in Section 8 one finds many open problems.



2 Preliminaries 

We follow the general notations and definitions as outlined in [?]. We use $\mathbb{N} =
\{1, 2, \dots\}$, $\mathbb{N}_0 = \mathbb{N} \cup \{0\}$, and $\mathbb{P}(M)$ for the set of subsets of set M.

2.1 Clause-sets 

Let $\mathcal{VA}$ be the infinite set of variables, and let $\mathcal{LIT} = \mathcal{VA} \cup \{\overline{v} : v \in \mathcal{VA}\}$ be the
set of literals. We use $\overline{L} := \{\overline{x} : x \in L\}$ to complement a set L of literals. A clause
is a finite subset $C \subset \mathcal{LIT}$ which is complement-free, i.e., $C \cap \overline{C} = \emptyset$; the set of
all clauses is denoted by $\mathcal{CL}$. A clause-set is a finite set of clauses, the set of all
clause-sets is $\mathcal{CLS}$. By $\mathrm{var}(x) \in \mathcal{VA}$ we denote the underlying variable of a literal
$x \in \mathcal{LIT}$, and we extend this via $\mathrm{var}(C) := \{\mathrm{var}(x) : x \in C\} \subset \mathcal{VA}$ for clauses C,
and via $\mathrm{var}(F) := \bigcup_{C \in F} \mathrm{var}(C)$ for clause-sets F. The possible literals in a
clause-set F are denoted by $\mathrm{lit}(F) := \mathrm{var}(F) \cup \overline{\mathrm{var}(F)}$. Measuring clause-sets happens by
$n(F) := |\mathrm{var}(F)|$ for the number of variables, $c(F) := |F|$ for the number of clauses,
and $\ell(F) := \sum_{C \in F} |C|$ for the number of literal occurrences. A special clause-set
is $\top := \emptyset \in \mathcal{CLS}$, the empty clause-set, and a special clause is $\bot := \emptyset \in \mathcal{CL}$, the
empty clause.

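A partial assignment is a map $\varphi : V \to \{0, 1\}$ for some finite $V \subset \mathcal{VA}$, where we
set $\mathrm{var}(\varphi) := V$, and where the set of all partial assignments is $\mathcal{PASS}$. For $v \in \mathrm{var}(\varphi)$
let $\varphi(\overline{v}) := \overline{\varphi(v)}$ (with $\overline{0} = 1$ and $\overline{1} = 0$). We construct partial assignments by terms
$\langle x_1 \to \varepsilon_1, \dots, x_n \to \varepsilon_n \rangle \in \mathcal{PASS}$ for literals $x_1, \dots, x_n$ with different underlying
variables and $\varepsilon_i \in \{0, 1\}$. We use $\varphi_C := \langle x \to 0 : x \in C \rangle$ for the partial assignment
setting precisely the literals in clause $C \in \mathcal{CL}$ to false.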

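For $\varphi \in \mathcal{PASS}$ and $F \in \mathcal{CLS}$ we denote the result of applying $\varphi$ to F by $\varphi * F$,
removing clauses $C \in F$ containing $x \in C$ with $\varphi(x) = 1$, and removing literals x
with $\varphi(x) = 0$ from the remaining clauses. By $\mathcal{SAT} := \{F \in \mathcal{CLS} \mid \exists\, \varphi \in \mathcal{PASS} :
\varphi * F = \top\}$ the set of satisfiable clause-sets is denoted, and by $\mathcal{USAT} := \mathcal{CLS} \setminus \mathcal{SAT}$
the set of unsatisfiable clause-sets.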
So clausal entailment, that is the relation $F \models C$ for $F \in \mathcal{CLS}$ and $C \in \mathcal{CL}$,
which by definition holds true iff for all $\varphi \in \mathcal{PASS}$ with $\varphi * F = \top$ we have
$\varphi * \{C\} = \top$, is equivalent to $\varphi_C * F \in \mathcal{USAT}$.
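The characterisation of clausal entailment via $\varphi_C * F \in \mathcal{USAT}$ translates directly into a test. The sketch below is illustrative only: it assumes literals encoded as non-zero integers, clauses as frozensets, and partial assignments as dictionaries from variables to booleans; the names `apply`, `satisfiable` and `entails` are hypothetical, and satisfiability is decided by brute force.

```python
from itertools import product


def apply(phi, F):
    """phi * F: drop satisfied clauses, remove falsified literals."""
    result = set()
    for C in F:
        if any(abs(x) in phi and phi[abs(x)] == (x > 0) for x in C):
            continue
        result.add(frozenset(x for x in C if abs(x) not in phi))
    return result


def satisfiable(F):
    """Brute force: does some total assignment reduce F to the empty clause-set?"""
    vs = sorted({abs(x) for C in F for x in C})
    return any(apply(dict(zip(vs, bits)), F) == set()
               for bits in product((False, True), repeat=len(vs)))


def entails(F, C):
    """F |= C iff phi_C * F is unsatisfiable, where phi_C falsifies all literals of C."""
    phi_C = {abs(x): x < 0 for x in C}
    return not satisfiable(apply(phi_C, F))


F = {frozenset({1, 2}), frozenset({-1, 2})}
```

For this F, the clauses {2} and {2, 3} are entailed, while {1} is not.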

Two clauses $C, D \in \mathcal{CL}$ are resolvable if they clash in exactly one literal x,
that is, $C \cap \overline{D} = \{x\}$, in which case their resolvent is $C \diamond D := (C \cup D) \setminus \{x, \overline{x}\}$
(with resolution literal x). A resolution tree is a full binary tree formed by the
resolution operation. We write $T : F \vdash C$ if T is a resolution tree with axioms
(the clauses at the leaves) all in F and with derived clause (at the root) C. A
resolution tree $T : F \vdash C$ is regular iff along each path from the root of T to a leaf
no resolution-variable is used more than once. A final remark: In this article we
use only resolution trees, even when speaking of unrestricted resolution (that is, we
always unfold dag-resolution proofs to (full) binary resolution trees).

A prime implicate of $F \in \mathcal{CLS}$ is a clause C such that a resolution tree T with
$T : F \vdash C$ exists, but no $T'$ exists for some $C' \subset C$ with $T' : F \vdash C'$; the set of all
prime implicates of F is denoted by $\mathrm{prc}_0(F) \in \mathcal{CLS}$. The term "implicate" refers
to the implicit interpretation of F as a conjunctive normal form (CNF). Considering
clauses as combinatorial objects one can speak of "prime clauses", and the "0" in
our notation reminds of "unsatisfiability", which is characteristic for CNF. Two
clause-sets $F, F' \in \mathcal{CLS}$ are equivalent iff $\mathrm{prc}_0(F) = \mathrm{prc}_0(F')$. A clause-set F is
unsatisfiable iff $\mathrm{prc}_0(F) = \{\bot\}$. The set of prime implicants of a clause-set $F \in \mathcal{CLS}$
is denoted by $\mathrm{prc}_1(F) \in \mathcal{CLS}$, and is the set of all clauses $C \in \mathcal{CL}$ such that for all
$D \in F$ we have $C \cap D \ne \emptyset$, while this holds for no strict subset of C.
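For tiny clause-sets, both sets can be computed by exhaustive enumeration. The sketch below does not build resolution trees; it uses the semantic characterisation instead (a clause is an implicate iff falsifying it makes F unsatisfiable) and then keeps the minimal clauses. The integer encoding of literals and the names `prc0`, `prc1` and `equivalent` are assumptions for illustration, and the enumeration over all $3^n$ clauses is exponential.

```python
from itertools import product


def apply(phi, F):
    """phi * F: drop satisfied clauses, remove falsified literals."""
    result = set()
    for C in F:
        if any(abs(x) in phi and phi[abs(x)] == (x > 0) for x in C):
            continue
        result.add(frozenset(x for x in C if abs(x) not in phi))
    return result


def satisfiable(F):
    vs = sorted({abs(x) for C in F for x in C})
    return any(apply(dict(zip(vs, bits)), F) == set()
               for bits in product((False, True), repeat=len(vs)))


def all_clauses(vs):
    """All complement-free clauses over the variables vs."""
    for signs in product((0, 1, -1), repeat=len(vs)):
        yield frozenset(s * v for s, v in zip(signs, vs) if s)


def minimal(clauses):
    """Keep the clauses that have no strict subset in the collection."""
    return {C for C in clauses if not any(D < C for D in clauses)}


def prc0(F):
    """Prime implicates of F (CNF interpretation)."""
    vs = sorted({abs(x) for C in F for x in C})
    implicates = [C for C in all_clauses(vs)
                  if not satisfiable(apply({abs(x): x < 0 for x in C}, F))]
    return minimal(implicates)


def prc1(F):
    """Prime implicants of F: minimal clauses intersecting every clause of F."""
    vs = sorted({abs(x) for C in F for x in C})
    return minimal([C for C in all_clauses(vs) if all(C & D for D in F)])


def equivalent(F, G):
    return prc0(F) == prc0(G)
```

As a usage example, the clause-set {{1, 2}, {-1, 2}} has the single prime implicate {2} and is therefore equivalent to {{2}}.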

As we said, the default interpretation of a clause-set is as a CNF, which we
can emphasise by speaking of the "CNF-clause-set F", that is, the interpretation
as a boolean function is

$$F \mapsto \bigwedge_{C \in F} \bigvee_{x \in C} x.$$

We might consider F also as a DNF-clause-set, which does not change F itself, but
only changes the interpretation of F in considerations regarding the semantics:

$$F \mapsto \bigvee_{C \in F} \bigwedge_{x \in C} x.$$

The above description of the sets $\mathrm{prc}_0(F)$, $\mathrm{prc}_1(F)$ as the set of prime implicates
resp. implicants holds for the default interpretation of F as CNF, while for the
DNF-interpretation $\mathrm{prc}_0(F)$ becomes the set of prime implicants, while $\mathrm{prc}_1(F)$ becomes
the set of prime implicates (of the boolean function underlying F). A CNF-clause-set
F is equivalent to a DNF-clause-set G iff $\mathrm{prc}_0(F) = \mathrm{prc}_1(G)$. The total satisfying
assignments of a (CNF-)clause-set F can be identified with the elements of the
canonical DNF of F, which is defined via the map $\mathrm{DNF} : \mathcal{CLS} \to \mathcal{CLS}$, where for
$F \in \mathcal{CLS}$ we set $\mathrm{DNF}(F) := \{C \in \mathcal{CL} \mid \mathrm{var}(C) = \mathrm{var}(F) \wedge \forall\, D \in F : C \cap D \ne \emptyset\}$.

While clause-sets and partial assignments themselves are neutral regarding CNF-
or DNF-interpretation, the application $\varphi * F$ is based on the CNF-interpretation
of $F$; if we wish to use the DNF-interpretation of $F$, then we use $\overline{\varphi} * F$, where
$\overline{\varphi} := \langle v \mapsto \overline{\varphi(v)} : v \in \mathrm{var}(\varphi) \rangle$. While $\top$ in the CNF-interpretation stands for
"true", in the DNF-interpretation it becomes "false".

Example 2.1 Consider $F := \{\{a\},\{b\}\} \in \mathcal{CLS}$ (with $n(F) = c(F) = \ell(F) = 2$).
Then $\mathrm{DNF}(F) = \{\{a,b\}\}$, and for $\varphi := \langle a, b \to 1 \rangle$ we have $\varphi * F = \top$. This
corresponds to the CNF-interpretation $a \wedge b$ of $F$, which has exactly one satisfying
assignment $\varphi$. If we consider the DNF-interpretation $a \vee b$ of $F$, then we have three
satisfying total assignments for the DNF-clause-set $F$, and for example the satisfying
assignment $\psi := \langle a \to 1 \rangle$ is recognised via $\overline{\psi} * F = \langle a \to 0 \rangle * F = \{\bot, \{b\}\}$, where
the result as DNF is a tautology, since $\bot$ as a DNF-clause becomes the constant 1
(as the empty conjunction).
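
The computations of Example 2.1 can be replayed with a small Python sketch. It assumes literals are non-zero integers (here $a = 1$, $b = 2$), partial assignments are dictionaries from variables to 0/1, and the helper names `apply` and `canonical_dnf` are hypothetical names chosen for this illustration.

```python
from itertools import product

def apply(phi, F):
    """Return phi * F (CNF-interpretation): drop satisfied clauses,
    remove falsified literals from the remaining clauses."""
    result = set()
    for C in F:
        if any(abs(x) in phi and phi[abs(x)] == (1 if x > 0 else 0) for x in C):
            continue  # clause contains a literal set to true
        result.add(frozenset(x for x in C if abs(x) not in phi))
    return frozenset(result)

def canonical_dnf(F):
    """All full clauses over var(F) that meet every clause of F."""
    vs = sorted({abs(x) for C in F for x in C})
    out = set()
    for signs in product((1, -1), repeat=len(vs)):
        C = frozenset(s * v for s, v in zip(signs, vs))
        if all(C & D for D in F):
            out.add(C)
    return frozenset(out)

F = {frozenset({1}), frozenset({2})}   # {{a}, {b}}
```

Here `apply({1: 1, 2: 1}, F)` is the empty clause-set $\top$, and `apply({1: 0}, F)` contains the empty clause together with `{2}`, matching the example.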
A basic problem considered in this article is for a given $F \in \mathcal{CLS}$ to find a
"good" equivalent $F' \in \mathcal{CLS}$. A lower bound for the size of $F'$ is given by the essential
prime implicates, which are those $C \in \mathrm{prc}_0(F)$ such that $\mathrm{prc}_0(F) \setminus \{C\}$ is not
equivalent to $F$:

Lemma 2.2 Consider $F \in \mathcal{CLS}$, and let $P \subseteq \mathrm{prc}_0(F)$ be the set of essential prime
implicates of $F$. Now for every $F' \in \mathcal{CLS}$ equivalent to $F$ there exists an injection
$i : P \to F'$ such that for all $C \in P$ holds $C \subseteq i(C)$. Thus $c(F') \geq c(P)$.

Proof: For every $C' \in F'$ there exists a $C \in \mathrm{prc}_0(F)$ such that $C \subseteq C'$; replacing
every $C' \in F'$ by such a chosen $C$ we obtain $F'' \subseteq \mathrm{prc}_0(F)$ with $P \subseteq F''$. □



2.2 Hardness and UCk 



We now turn to the most basic hardness measurement. It can be based on resolu- 
tion refutation trees, as we do here, but it can also be defined algorithmically, via 
generalised unit-clause propagation (see Lemma 2.6). 



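Definition 2.3 For a full binary tree $T$ the height $\mathrm{ht}(T) \in \mathbb{N}_0$ and the Horton-
Strahler number $\mathrm{hs}(T) \in \mathbb{N}_0$ are defined as follows:

1. If $T$ is trivial (i.e., $\#\mathrm{nds}(T) = 1$), then $\mathrm{ht}(T) := 0$ and $\mathrm{hs}(T) := 0$.

2. Otherwise let $T_1, T_2$ be the two subtrees of $T$:

(a) $\mathrm{ht}(T) := 1 + \max(\mathrm{ht}(T_1), \mathrm{ht}(T_2))$

(b) If $\mathrm{hs}(T_1) = \mathrm{hs}(T_2)$, then $\mathrm{hs}(T) := 1 + \max(\mathrm{hs}(T_1), \mathrm{hs}(T_2))$, otherwise
$\mathrm{hs}(T) := \max(\mathrm{hs}(T_1), \mathrm{hs}(T_2))$.

Obviously we always have $\mathrm{hs}(T) \leq \mathrm{ht}(T)$.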
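
Definition 2.3 translates directly into a recursive computation. The following Python sketch assumes a hypothetical tree encoding chosen only for this illustration: a leaf is the empty tuple `()`, and an inner node is a pair `(T1, T2)` of subtrees.

```python
def ht(T):
    """Height of a full binary tree; a leaf () has height 0."""
    return 0 if T == () else 1 + max(ht(T[0]), ht(T[1]))

def hs(T):
    """Horton-Strahler number; it increases only where both subtrees tie."""
    if T == ():
        return 0
    a, b = hs(T[0]), hs(T[1])
    return a + 1 if a == b else max(a, b)

leaf = ()
complete2 = ((leaf, leaf), (leaf, leaf))      # complete tree of height 2
caterpillar = (((leaf, leaf), leaf), leaf)    # height 3, "path-like"
```

The complete tree of height 2 has `hs` equal to its height, whereas the caterpillar of height 3 has `hs` equal to 1, which shows how far the two measures can differ.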



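Example 2.4 For the tree $T$ from Example 4.3 we have $\mathrm{ht}(T) = 3$, $\mathrm{hs}(T) = 2$.
The Horton-Strahler numbers of the subtrees are as follows:

[Figure: the tree $T$ with every node annotated by the Horton-Strahler number of its subtree]
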
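Definition 2.5 The hardness $\mathrm{hd} : \mathcal{CLS} \to \mathbb{N}_0$ is defined for $F \in \mathcal{CLS}$ as follows:

1. If $F \in \mathcal{USAT}$, then $\mathrm{hd}(F)$ is the minimum $\mathrm{hs}(T)$ for $T : F \vdash \bot$.

2. If $F = \top$, then $\mathrm{hd}(F) := 0$.

3. If $F \in \mathcal{SAT} \setminus \{\top\}$, then $\mathrm{hd}(F) := \max_{\varphi \in \mathcal{PASS}} \{\mathrm{hd}(\varphi * F) : \varphi * F \in \mathcal{USAT}\}$.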

Hardness for unsatisfiable clause-sets was introduced in |42], ^ , while this gen- 
eralisation to arbitrary clause-sets was first mentioned in ffi and systematically 
studied in ^ |3^. Definition 2.5 defines hardness proof-theoretically; impor- 
tantly, it can also be characterised algorithmically via necessary levels of generalised 
unit-clause propagation (see|^, ^ ^ for the details): 
Lemma 2.6 Consider the reductions $r_k : \mathcal{CLS} \to \mathcal{CLS}$ for $k \in \mathbb{N}_0$ as introduced in
; it is $r_1$ unit-clause propagation, while $r_2$ is (full, iterated) failed-literal elimination.
Then $\mathrm{hd}(F)$ for $F \in \mathcal{CLS}$ is the minimal $k \in \mathbb{N}_0$ such that for all $\varphi \in \mathcal{PASS}$
with $\varphi * F \in \mathcal{USAT}$ holds $r_k(\varphi * F) = \{\bot\}$, i.e., the minimal $k$ such that $r_k$ detects
unsatisfiability of any instantiation.
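
The algorithmic characterisation of Lemma 2.6 can be tried out on tiny clause-sets. The reductions $r_k$ are not spelled out in this section, so the sketch below assumes the usual recursive definition (an assumption of this illustration): $r_0$ only detects the empty clause, and $r_{k}$ sets a literal $x$ to true whenever setting $x$ to false is refuted by $r_{k-1}$, repeating until nothing changes. The function names and the integer encoding of literals are hypothetical, and the enumeration of all partial assignments is exponential, so this is for illustration only.

```python
from itertools import product

BOT = frozenset([frozenset()])          # the clause-set containing only the empty clause

def assign(F, x):
    """Apply the partial assignment setting literal x to true."""
    return frozenset(C - {-x} for C in F if x not in C)

def r(k, F):
    """Assumed generalised unit-clause propagation r_k."""
    F = frozenset(F)
    while True:
        if frozenset() in F:
            return BOT
        if k == 0:
            return F
        vs = sorted({abs(l) for C in F for l in C})
        forced = next((x for v in vs for x in (v, -v)
                       if r(k - 1, assign(F, -x)) == BOT), None)
        if forced is None:
            return F
        F = assign(F, forced)           # x false is refuted, so x is forced

def hardness(F):
    """hd(F) via Lemma 2.6: worst case over all unsatisfiable instantiations."""
    F = frozenset(frozenset(C) for C in F)
    vs = sorted({abs(l) for C in F for l in C})
    best = 0
    for vals in product((None, 0, 1), repeat=len(vs)):
        G = F
        for v, b in zip(vs, vals):
            if b is not None:
                G = assign(G, v if b else -v)
        if r(len(vs), G) != BOT:        # r_n is complete, so G is satisfiable
            continue
        k = 0
        while r(k, G) != BOT:
            k += 1
        best = max(best, k)
    return best
```

For the clause-set consisting of all four binary clauses over two variables, unit-clause propagation alone does not find the contradiction, and the sketch reports hardness 2.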

We can now define our main hierarchy, the $\mathcal{UC}_k$-hierarchy (with "UC" for "unit-
refutation complete") via (tree-)hardness:

Definition 2.7 For $k \in \mathbb{N}_0$ let $\mathcal{UC}_k := \{F \in \mathcal{CLS} : \mathrm{hd}(F) \leq k\}$.

$\mathcal{UC}_1 = \mathcal{UC}$ is the class of unit-refutation complete clause-sets, as introduced in |p3| .
In 11^, ^ we show that $\mathcal{UC} = \mathcal{SLUR}$, where $\mathcal{SLUR}$ is the class of clause-sets
solvable via Single Lookahead Unit Resolution (see [27]). Using we then obtain
( [30] , ^ ^ ) that membership decision for $\mathcal{UC}_k$ ($= \mathcal{SLUR}_k$) is coNP-complete for
$k \geq 1$. The class $\mathcal{UC}_2$ is the class of all clause-sets where unsatisfiability for any
partial assignment is detected by failed-literal reduction (see Section 5.2.1 in |Q
for the usage of failed literals in SAT solvers).

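A basic fact is that the classes $\mathcal{UC}_k$ are stable under application of partial assign-
ments, in other words, for $F \in \mathcal{CLS}$ and $\varphi \in \mathcal{PASS}$ we have $\mathrm{hd}(\varphi * F) \leq \mathrm{hd}(F)$. For
showing lower bounds on the hardness for unsatisfiable clause-sets, we can use the
methodology developed in Subsection 3.4.2 of . A simplified version of Lemma
3.17 from |Q, sufficient for our purposes, is as follows:

Lemma 2.8 Consider $\mathcal{C} \subseteq \mathcal{USAT}$ and a function $h : \mathcal{C} \to \mathbb{N}_0$. For $k \in \mathbb{N}_0$ let
$\mathcal{C}_k := \{F \in \mathcal{C} : h(F) \geq k\}$. Then $\forall\, F \in \mathcal{C} : \mathrm{hd}(F) \geq h(F)$ holds if and only if for
all $k \in \mathbb{N}$, $F \in \mathcal{C}_k$ and $x \in \mathrm{lit}(F)$ there exist clause-sets $F_0, F_1 \in \mathcal{CLS}$ fulfilling the
following two conditions:

1. $\mathrm{hd}(F_0) \leq \mathrm{hd}(\langle x \to 0 \rangle * F)$ and $\mathrm{hd}(F_1) \leq \mathrm{hd}(\langle x \to 1 \rangle * F)$;

2. $F_0 \in \mathcal{C}_k$ or $F_1 \in \mathcal{C}_{k-1}$.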

Complementary to "unit-refutation completeness", there is the notion of "pro-
pagation-completeness" as investigated in [|2[ |ll[, yielding the class $\mathcal{PC} \subset \mathcal{UC}$.
This was captured and generalised by a measure $\mathrm{phd} : \mathcal{CLS} \to \mathbb{N}_0$ of "propagation-
hardness" along with the associated hierarchy, defined in ^ as follows:

Definition 2.9 For $F \in \mathcal{CLS}$ we define the propagation-hardness (for short
"p-hardness") $\mathrm{phd}(F) \in \mathbb{N}_0$ as the minimal $k \in \mathbb{N}_0$ such that for all partial assign-
ments $\varphi \in \mathcal{PASS}$ we have $r_k(\varphi * F) = r_\infty(\varphi * F)$, where $r_k : \mathcal{CLS} \to \mathcal{CLS}$ is gener-
alised UCP (^^, P^/J), and $r_\infty : \mathcal{CLS} \to \mathcal{CLS}$ applies all forced assignments, and can
be defined by $r_\infty(F) := r_{n(F)}(F)$. For $k \in \mathbb{N}_0$ let $\mathcal{PC}_k := \{F \in \mathcal{CLS} : \mathrm{phd}(F) \leq k\}$
(the class of propagation-complete clause-sets of level $k$).

We have $\mathcal{PC} = \mathcal{PC}_1$. For $k \in \mathbb{N}_0$ we have $\mathcal{PC}_k \subset \mathcal{UC}_k \subset \mathcal{PC}_{k+1}$.

2.3 W-Hardness and WCk

A basic weakness of the standard notion of width-restricted resolution, which de-
mands that both parent clauses must have length at most $k$ for some fixed $k \in \mathbb{N}_0$
(the "width"), is that even Horn clause-sets require unbounded width in this sense.
The correct solution, as investigated and discussed in ^5), is to use the notion
of "$k$-resolution" as introduced in [40], where only one parent clause needs
to have length at most $k$ (thus properly generalising unit-resolution). Nested
input-resolution (|^, is the proof-theoretic basis of hardness, and approximates
tree-resolution. In the same vein, $k$-resolution is the proof-theoretic basis of
"w-hardness", and approximates dag-resolution (see Theorem 6.12 in p5[):

Definition 2.10 The w-hardness $\mathrm{whd} : \mathcal{CLS} \to \mathbb{N}_0$ ("width-hardness") is defined
for $F \in \mathcal{CLS}$ as follows:

1. If $F \in \mathcal{USAT}$, then $\mathrm{whd}(F)$ is the minimum $k \in \mathbb{N}_0$ such that $k$-resolution
refutes $F$, that is, such that $T : F \vdash \bot$ exists where for each resolution step
$R = C \diamond D$ in $T$ we have $|C| \leq k$ or $|D| \leq k$ (this corresponds to Definition
8.2 in /^/, and is a special case of widu introduced in Subsection 6.1 of ).

2. If $F = \top$, then $\mathrm{whd}(F) := 0$.

3. If $F \in \mathcal{SAT} \setminus \{\top\}$, then $\mathrm{whd}(F) := \max \{\mathrm{whd}(\varphi * F) : \varphi * F \in \mathcal{USAT}\}$.

For $k \in \mathbb{N}_0$ let $\mathcal{WC}_k := \{F \in \mathcal{CLS} : \mathrm{whd}(F) \leq k\}$.

We have $\mathcal{WC}_0 = \mathcal{UC}_0$, $\mathcal{WC}_1 = \mathcal{UC}_1$, and for all $k \in \mathbb{N}_0$ holds $\mathcal{UC}_k \subseteq \mathcal{WC}_k$ (this follows
by Lemma 6.8 in |Q for unsatisfiable clause-sets, which extends to satisfiable clause-
sets by definition). For unsatisfiable $F$, whether $\mathrm{whd}(F) = k$ holds for $k \in \{0, 1, 2\}$
can be decided in polynomial time; this is non-trivial for $k = 2$ ([l^) and unknown
for $k > 2$. Nevertheless, the clausal entailment problem $F \models C$ for $F \in \mathcal{WC}_k$ and
fixed $k \in \mathbb{N}_0$ is decidable in polynomial time, as shown in Subsection 6.5 of [45], by
actually using a slight strengthening of $k$-resolution, which combines width-bounded
resolution and input resolution. While space-complexity of the decision $F \models C$ for
$F \in \mathcal{UC}_k$ is linear (for fixed $k$), now for $\mathcal{WC}_k$ space-complexity is $O(\ell(F) \cdot n(F)^{O(k)})$.

As a special case of Theorem 6.12 in we obtain for $F \in \mathcal{USAT}$, $n(F) \neq 0$,
the following general lower bound on resolution complexity:

$$\mathrm{Comp}_R(F) > b^{\frac{\mathrm{whd}(F)^2}{n(F)}},$$

where $b := e^{1/8} = 1.1331484...$, while $\mathrm{Comp}_R(F) \in \mathbb{N}$ is the minimal number of
different clauses in a (tree-)resolution refutation of $F$. Similar to Theorem 14 in
[30] resp. Theorem 5.7 in [^l|, ^ we thus obtain:

Lemma 2.11 For F E CCS and k E No, such that for every C E prCo(F) with 

\C\ < n{F) there exists a resolution proof of C from F using at most 6"<*7hcT 
different clauses, we have whd(F) < k. 



3 Minimal premise sets and doped clause-sets 

In this section we study "minimal premise sets", "mps's" for short, introduced in
[47], together with the properties of "doped" clause-sets, generalising a construction
used in [Q . Mps's are generalisations of minimally unsatisfiable clause-sets stronger
than irredundant clause-sets, while doping relates prime implicates and sub-mps's.

Recall that a clause-set $F$ is minimally unsatisfiable if $F \in \mathcal{USAT}$, while for all
$C \in F$ holds $F \setminus \{C\} \in \mathcal{SAT}$. The set of all minimally unsatisfiable clause-sets is
$\mathcal{MU} \subset \mathcal{CLS}$; see for more information. In other words, for $F \in \mathcal{CLS}$ we have
$F \in \mathcal{MU}$ if and only if $F \models \bot$ and $F$ is minimal regarding this entailment relation.
Now an mps is a clause-set $F$ which minimally implies some clause $C$, i.e., $F \models C$,
while $F' \not\models C$ for all $F' \subset F$. In Subsection 3.1 we study the basic properties of
mps's $F$, and determine the unique minimal clause implied by $F$ as $\mathrm{puc}(F)$, the set
of pure literals of $F$.
For a clause-set $F$ its doped version $D(F) \in \mathcal{CLS}$ receives an additional new
("doping") variable for each clause. The basic properties are studied in Subsection
3.2, and in Theorem 3.18 we show that the prime implicates of $D(F)$ correspond
1-1 to the mps's contained in $F$. In Subsection 3.3 we determine the hardness of
doped clause-sets.

3.1 Minimal premise sets 

In Section 4.1 in [4^ basic properties of minimal premise sets are considered:

Definition 3.1 A clause-set $F \in \mathcal{CLS}$ is a minimal premise set ("mps") for a
clause $C \in \mathcal{CL}$ if $F \models C$ and $\forall\, F' \subset F : F' \not\models C$, while $F$ is a minimal premise
set if there exists a clause $C$ such that $F$ is a minimal premise set for $C$. The set
of all minimal premise (clause-)sets is denoted by $\mathcal{MPS}$.

Remarks:

1. $\top$ is not an mps (since no clause follows from $\top$).

2. An unsatisfiable clause-set is an mps iff it is minimally unsatisfiable, i.e.,
$\mathcal{MPS} \cap \mathcal{USAT} = \mathcal{MU}$. In Corollary 3.8 we will see that the minimally
unsatisfiable clause-sets are precisely the mps's without pure literals.

3. Every minimal premise clause-set is irredundant (no clause follows from the
other clauses).

4. For a clause-set $F$ and any implicate $F \models C$ there exists a minimal premise
sub-clause-set $F' \subseteq F$ for $C$.

5. A single clause $C$ yields an mps $\{C\}$.

6. Two clauses $C \neq D$ yield an mps $\{C, D\}$ iff $C, D$ are resolvable.

7. If $F_1, F_2 \in \mathcal{MPS}$ with $\mathrm{var}(F_1) \cap \mathrm{var}(F_2) = \emptyset$, then $F_1 \cup F_2 \notin \mathcal{MPS}$ except in
case of $F_1 = F_2 = \{\bot\}$.

Example 3.2 $\{\{a\}, \{b\}\}$ for variables $a \neq b$ is irredundant but not an mps.

With Corollary 4.5 in we see that no clause-set can minimally entail more
than one clause:

Lemma 3.3 For $F \in \mathcal{MPS}$ there exists exactly one $C \in \mathrm{prc}_0(F)$ such that $F$ is a
minimal premise set for $C$, and $C$ is the smallest element of the set of clauses for
which $F$ is a minimal premise set.

We wish now to determine that unique prime implicate $C$ which follows from
an mps $F$. It is clear that $C$ must contain all pure literals from $F$, since all clauses
of $F$ must be used, and we cannot get rid of pure literals.

Definition 3.4 For $F \in \mathcal{CLS}$ the pure clause of $F$, denoted by $\mathrm{puc}(F) \in \mathcal{CL}$,
is the set of pure literals of $F$, that is, $\mathrm{puc}(F) := L \setminus (L \cap \overline{L})$, where $L := \bigcup F$ is
the set of literals occurring in $F$.

Example 3.5 For $F = \{\{a, b\}, \{\overline{a}, c\}\}$ we have $\mathrm{puc}(F) = \{b, c\}$.
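
Definition 3.4 amounts to a one-line set computation. The sketch below assumes the integer encoding of literals (complement is negation) used for illustration only, with a hypothetical function name `puc`; the example call takes $a = 1$, $b = 2$, $c = 3$ and a clause-set in which $a$ occurs in both polarities.

```python
def puc(F):
    """Pure clause of F: literals occurring in F whose complement does not."""
    L = set().union(*F)                 # all literal occurrences in F
    return frozenset(x for x in L if -x not in L)

example = [frozenset({1, 2}), frozenset({-1, 3})]
```

Evaluating `puc(example)` gives the clause `{2, 3}`, and an unsatisfiable pair such as `{1}`, `{-1}` has the empty pure clause.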

The main observation for determining $C$ is that the conclusion of a regular
resolution proof consists precisely of the pure literals of the axioms (this follows by
definition):

Lemma 3.6 For a regular resolution proof $T : F \vdash C$, where every clause of $F$ is
used in $T$, we have $C = \mathrm{puc}(F)$.

Due to the completeness of regular resolution we thus see that $\mathrm{puc}(F)$ is the
desired unique prime implicate:

Lemma 3.7 For $F \in \mathcal{MPS}$ the unique prime implicate $C$, for which $F$ is a mini-
mal premise set (see Lemma 3.3), is $C = \mathrm{puc}(F)$.

Proof: Consider a regular resolution proof $T : F \vdash C$ (recall that regular resolution
is complete); due to $F \in \mathcal{MPS}$ every clause of $F$ must be used in $T$, and thus the
assertion follows by Lemma 3.6. □

Corollary 3.8 If we have $F \in \mathcal{MPS}$ with $\mathrm{puc}(F) = \bot$, then $F \in \mathcal{MU}$.

By Lemma 4.4 in [47] we get the main characterisation of mps's, namely that
after elimination of pure literals they must be minimally unsatisfiable:

Lemma 3.9 Consider a clause-set $F \in \mathcal{CLS}$. Then $F \in \mathcal{MPS}$ if and only if the
following two conditions hold for $\varphi := \varphi_{\mathrm{puc}(F)}$ (setting precisely the pure literals of
$F$ to false):

1. $\varphi * F \in \mathcal{MU}$ (after removing the pure literals we obtain a minimally unsatisfiable
clause-set).

2. $\varphi$ is contraction-free for $F$, that is, for clauses $C, D \in F$ with $C \neq D$ we have
$\varphi * \{C\} \neq \varphi * \{D\}$.

These two conditions are equivalent to stating that $\varphi * F$ as a multi-clause-set (not
contracting equal clauses) is minimally unsatisfiable.
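
The multi-clause-set formulation of Lemma 3.9 gives a direct (exponential, illustration-only) membership test for minimal premise sets. The following Python sketch assumes literals are non-zero integers; `satisfiable` and `is_mps` are hypothetical helper names, and satisfiability is decided by brute force.

```python
from itertools import product

def satisfiable(clauses):
    """Brute-force SAT test for a list (multiset) of clauses."""
    vs = sorted({abs(x) for C in clauses for x in C})
    for bits in product((False, True), repeat=len(vs)):
        val = dict(zip(vs, bits))
        if all(any(val[abs(x)] == (x > 0) for x in C) for C in clauses):
            return True
    return False

def is_mps(F):
    """Lemma 3.9: remove the pure literals and test minimal unsatisfiability
    of the result as a multi-clause-set (equal clauses are not contracted)."""
    F = list({frozenset(C) for C in F})
    if not F:
        return False                     # the empty clause-set is not an mps
    lits = set().union(*F)
    pure = {x for x in lits if -x not in lits}
    reduced = [C - pure for C in F]      # list keeps duplicates
    if satisfiable(reduced):
        return False
    # every single clause occurrence must be necessary
    return all(satisfiable(reduced[:i] + reduced[i + 1:])
               for i in range(len(reduced)))
```

With this test, two resolvable clauses form an mps, whereas the irredundant clause-set of Example 3.2 does not, since removing its pure literals contracts both clauses to the empty clause.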

Thus we obtain all mps's by considering some minimally unsatisfiable clause-sets
and adding new variables in the form of pure literals:

Corollary 3.10 The following process generates precisely the $F' \in \mathcal{MPS}$:

1. Choose $F \in \mathcal{MU}$.

2. Choose a clause $P$ with $\mathrm{var}(P) \cap \mathrm{var}(F) = \emptyset$ ("P" like "pure").

3. Choose a map $e : F \to \mathbb{P}(P)$ ("e" like "extension").

4. Let $F' := \{C \cup e(C) : C \in F\}$.

For unsatisfiable clause-sets the set of minimally unsatisfiable sub-clause-sets
has been studied extensively in the literature; see |Q for a recent overview. The
set of subsets which are mps's strengthens this notion (now for all clause-sets):

Definition 3.11 For a clause-set $F \in \mathcal{CLS}$ the set of all minimal premise
sub-clause-sets is denoted by $\mathrm{mps}(F) \subset \mathcal{CLS}$, that is, $\mathrm{mps}(F) := \mathbb{P}(F) \cap \mathcal{MPS}$.

We have $|\mathrm{mps}(F)| \leq 2^{c(F)} - 1$ (there is a typo in Corollary 4.6 of ^7|, misplacing
the "$-1$" into the exponent). The minimal elements of $\mathrm{mps}(F)$ are $\{C\} \in \mathrm{mps}(F)$
for $C \in F$. Since every prime implicate of a clause-set has some minimal premise
sub-clause-set, we get that running through all sub-mps's in a clause-set $F$ and
extracting the clauses with the pure literals we obtain at least all prime implicates:
Lemma 3.12 For $F \in \mathcal{CLS}$ the map $F' \in \mathrm{mps}(F) \mapsto \mathrm{puc}(F') \in \{C \in \mathcal{CL} : F \models C\}$
covers $\mathrm{prc}_0(F)$ (i.e., its range contains the prime implicates of $F$).

Example 3.13 Examples where we have more minimal premise sub-clause-sets
than prime implicates are given by $F \in \mathcal{MU}$, where $\mathrm{prc}_0(F) = \{\bot\}$, while in
the most extreme case every non-empty subset of $F$ can be a minimal premise sub-
clause-set (see Theorem 4.11).



3.2 Doping clause-sets

"Doping" is the process of adding a unique new variable to every clause of a clause-
set. It enables us to follow the usage of this clause in derivations:

Definition 3.14 For every clause-set $F \in \mathcal{CLS}$ we assume an injection $u^F : F \to
\mathcal{VA} \setminus \mathrm{var}(F)$ in the following, assigning to every clause $C$ a different variable $u^F_C$. For
a clause $C \in \mathcal{CL}$ and a clause-set $F \in \mathcal{CLS}$ we then define the doping $D_F(C) :=
C \cup \{u^F_C\} \in \mathcal{CL}$, while $D(F) := \{D_F(C) : C \in F\} \in \mathcal{CLS}$.

Remarks:

1. "Doping" has various meanings, where here we mean the meaning as explained
in Wikipedia; the correct German translation is "dotieren".

2. In the following we drop the upper index in "$u^F_C$", i.e., we just use "$u_C$".

3. We have $D : \mathcal{CLS} \to \mathcal{SAT}$.

4. For $F \in \mathcal{CLS}$ we have $n(D(F)) = n(F) + c(F)$ and $c(D(F)) = c(F)$.

5. For $F \in \mathcal{CLS}$ we have $\mathrm{puc}(D(F)) = \mathrm{puc}(F) \cup \{u_C : C \in F\}$.
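
The doping construction and Remarks 4 and 5 can be checked mechanically on small inputs. In the following Python sketch the doping variables are simply taken to be the next unused integers; this choice of injection, the integer encoding of literals, and the names `dope` and `puc` are assumptions of the illustration.

```python
def puc(F):
    """Pure clause: literals of F whose complement does not occur in F."""
    L = set().union(*F)
    return frozenset(x for x in L if -x not in L)

def dope(F):
    """Add one fresh positive variable to every clause of F."""
    clauses = sorted({frozenset(C) for C in F}, key=sorted)  # fixed order
    n = max((abs(x) for C in clauses for x in C), default=0)
    # clause number i (1-based) receives the fresh variable n + i
    return {C | {n + i} for i, C in enumerate(clauses, start=1)}

F = [frozenset({1, 2}), frozenset({-1, 2})]
DF = dope(F)
```

Here `DF` has the same number of clauses as `F`, two additional variables, and its pure clause consists of the pure literal `2` of `F` together with both doping variables.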

We are interested in the prime implicates of doped clause-sets. It is easy to see that
all doped clauses are themselves essential prime implicates:

Lemma 3.15 For $F \in \mathcal{CLS}$ we have $D(F) \subseteq \mathrm{prc}_0(D(F))$, and furthermore all
elements of $D(F)$ are essential prime implicates.

Proof: Every resolvent of clauses from $D(F)$ contains at least two doping variables,
and thus the clauses of $D(F)$ themselves (which contain only one doping variable)
are prime and necessary. □

Thus by Lemma 2.2 among all the clause-sets equivalent to $D(F)$ this clause-set
itself is the smallest. Directly by Lemma 3.9 we get that a clause-set is an mps iff
its doped form is an mps:

Lemma 3.16 For $F \in \mathcal{CLS}$ holds $F \in \mathcal{MPS} \Leftrightarrow D(F) \in \mathcal{MPS}$. Thus the map
$F' \in \mathrm{mps}(F) \mapsto D(F')$ is a bijection from $\mathrm{mps}(F)$ to $\mathrm{mps}(D(F))$.

For doped clause-sets the surjection of Lemma 3.12 is bijective:

Lemma 3.17 Consider a clause-set $F \in \mathcal{CLS}$, and let $G := D(F)$.

1. The map $F' \in \mathrm{mps}(G) \mapsto \mathrm{puc}(F') \in \mathcal{CL}$ is a bijection from $\mathrm{mps}(G)$ to
$\mathrm{prc}_0(G)$.

2. The inverse map from $\mathrm{prc}_0(G)$ to $\mathrm{mps}(G)$ obtains from $C \in \mathrm{prc}_0(G)$ the
clause-set $F' \in \mathrm{mps}(G)$ with $\mathrm{puc}(F') = C$ as $F' = \{D(D) : D \in F \wedge u_D \in
\mathrm{var}(C)\}$.
Proof: By Lemma 3.12 it remains to show that the map of Part 1 is injective and
does not have subsumptions in the image. Assume for the sake of contradiction there
are $G', G'' \in \mathrm{mps}(G)$, $G' \neq G''$, with $\mathrm{puc}(G') \subseteq \mathrm{puc}(G'')$. Since every clause of $F$
has a different doping-variable, $G' \subset G''$ must hold. Consider the $F', F'' \in \mathrm{mps}(F)$
with $D(F') = G'$ and $D(F'') = G''$. We have $F' \subset F''$, and thus $\mathrm{puc}(F') \not\subseteq \mathrm{puc}(F'')$,
since for every $F \in \mathcal{MPS}$ the clause $\mathrm{puc}(F)$ is a prime implicate of $F$. It follows
that $\mathrm{puc}(G') \not\subseteq \mathrm{puc}(G'')$, contradicting the assumption. □



By Lemma 3.16 and Lemma 3.17 we obtain

Theorem 3.18 Consider $F \in \mathcal{CLS}$. Then the map $F' \in \mathrm{mps}(F) \mapsto \mathrm{puc}(D(F')) \in
\mathcal{CL}$ is a bijection from $\mathrm{mps}(F)$ to $\mathrm{prc}_0(D(F))$.

Theorem 3.18 together with the description of the inversion map in Lemma 3.17
yields computation of the set $\mathrm{mps}(F)$ for $F \in \mathcal{CLS}$ via computation of $\mathrm{prc}_0(D(F))$.

Corollary 3.19 For $F \in \mathcal{CLS}$ we obtain a map from $\mathrm{prc}_0(D(F))$ to the set of
implicates of $F$ covering $\mathrm{prc}_0(F)$ by the mapping $C \in \mathrm{prc}_0(D(F)) \mapsto C \setminus V$ for
$V := \{u_D : D \in F\}$.

Proof: The given map can be obtained as a composition as follows: For $C \in
\mathrm{prc}_0(D(F))$ take (the unique) $F' \in \mathrm{mps}(F)$ with $\mathrm{puc}(D(F')) = C$, and we have
$C \setminus V = \mathrm{puc}(F')$. □



3.3 Hardness of doped clause-sets 

The hardness of a doped clause-set is the maximal hardness of sub-clause-sets of
the original clause-set:

Lemma 3.20 For $F \in \mathcal{CLS}$ we have $\mathrm{hd}(D(F)) = \max_{F' \subseteq F} \mathrm{hd}(F')$.

Proof: We have $\mathrm{hd}(F') \leq \mathrm{hd}(D(F))$ for all $F' \subseteq F$, since via applying a suitable
partial assignment we obtain $F'$ from $D(F)$, setting the doping-variables in $F'$ to false,
and the rest to true. And if we consider an arbitrary partial assignment $\varphi$ with
$\varphi * D(F) \in \mathcal{USAT}$, then w.l.o.g. all doping variables are set (we can set the doping-
variables not used by $\varphi$ to true, since these variables are all pure), and then we have
a partial assignment making $F'$ unsatisfiable for that $F' \subseteq F$ given by all the
doping variables set by $\varphi$ to false. □

Example 3.21 For an example of a clause-set $F \in \mathcal{USAT}$ with $\mathrm{hd}(D(F)) > \mathrm{hd}(F)$
consider any clause-set $F' \in \mathcal{CLS}$ with $\mathrm{hd}(F') > 0$, and then take $F := F' \cup \{\bot\}$
(note that $\bot \notin F'$). Thus $\mathrm{hd}(F) = 0$. And by Part 1 of Lemma 6.5 in |||/, all
$\mathcal{UC}_k$ are closed under partial assignments, so for $\varphi := \langle u_\bot \to 1 \rangle \cup \langle u_C \to 0 \mid C \in F' \rangle$
we have $\mathrm{hd}(D(F)) \geq \mathrm{hd}(\varphi * D(F)) = \mathrm{hd}(F') > \mathrm{hd}(F) = 0$.



4 Doping tree clause-sets 



As explained in Subsection 1.4, we want to construct boolean functions (given by
clause-sets) with a large number of prime implicates, and where we have strong con-
trol over these prime implicates. For this purpose we dope "minimally unsatisfiable
clause-sets of deficiency 1", that is, the elements of $\mathcal{SMU}_{\delta=1}$. First we review in
Subsection 4.1 the background (for more information see pi[). Then in Subsection
4.2 we consider doping of these special clause-sets. In Theorem 4.11 we show that
$F \in \mathcal{SMU}_{\delta=1}$ are precisely the unsatisfiable clause-sets such that every non-empty
subset is an mps, and in Theorem 4.18 we determine basic properties of $D(F)$.



4.1 Preliminaries on minimal unsatisfiability 

A minimally unsatisfiable $F \in \mathcal{MU}$ is saturated minimally unsatisfiable iff for all
clauses $C \in F$ and for every literal $x$ with $\mathrm{var}(x) \notin \mathrm{var}(C)$ the clause-set $(F \setminus \{C\}) \cup
\{C \cup \{x\}\}$ is satisfiable. The set of all saturated minimally unsatisfiable clause-sets
is denoted by $\mathcal{SMU} \subset \mathcal{MU}$. By $\mathcal{SMU}_{\delta=k}$ we denote the set of $F \in \mathcal{SMU}$ with
$\delta(F) = k$, where the deficiency of a clause-set $F$ is given by $\delta(F) := c(F) - n(F)$.
In (generalised in [^) it is shown that the elements of $\mathcal{SMU}_{\delta=1}$ are exactly the
clause-sets introduced in [|l^. The details are as follows. For rooted trees $T$ we use
$\mathrm{nds}(T)$ for the set of nodes and $\mathrm{lvs}(T) \subseteq \mathrm{nds}(T)$ for the set of leaves, and we set
$\#\mathrm{nds}(T) := |\mathrm{nds}(T)|$ and $\#\mathrm{lvs}(T) := |\mathrm{lvs}(T)|$. In our context, the nodes of rooted
trees are just determined by their positions, and do not have names themselves.
Another useful notation for a tree $T$ and a node $w$ is $T_w$, which is the sub-tree of $T$
with root $w$; so $\mathrm{lvs}(T) = \{w \in \mathrm{nds}(T) : \#\mathrm{nds}(T_w) = 1\}$. Recall that for a full binary
tree $T$ (every non-leaf node has two children) we have $\#\mathrm{nds}(T) = 2\,\#\mathrm{lvs}(T) - 1$.

Definition 4.1 Consider a full binary tree $T$ and an injective vertex labelling $u :
(\mathrm{nds}(T) \setminus \mathrm{lvs}(T)) \to \mathcal{VA}$ for the inner nodes; the set of all such pairs is denoted
by $\mathcal{T}_1$. The induced edge-labelling assigns to every edge from an inner node $w$ to
a child $w'$ the literal $u(w)$ resp. $\overline{u(w)}$ for a left resp. right child. We define the
clause-set representation $F^1(T,u)$ (where "1" reminds of deficiency 1 here; see
Lemma 4.2) to be $F^1(T,u) := \{C_w : w \in \mathrm{lvs}(T)\}$, where clause $C_w$ consists of all
the literals (i.e., edge-labels) on the path from the root of $T$ to $w$.
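
The clause-set representation of a labelled tree is a straightforward path collection. The following Python sketch assumes a hypothetical encoding chosen for illustration: a leaf is `None`, an inner node is a triple `(v, left, right)` with `v` a positive integer (the variable label), and literals are non-zero integers.

```python
def tree_clauses(T, path=frozenset()):
    """Clause-set of a labelled full binary tree: one clause per leaf,
    consisting of the edge-labels on the path from the root.
    Left edges carry the positive literal v, right edges its complement -v."""
    if T is None:
        return frozenset([path])
    v, left, right = T
    return tree_clauses(left, path | {v}) | tree_clauses(right, path | {-v})

# root labelled 1, with children labelled 2 (left) and 3 (right)
T = (1, (2, None, None), (3, None, None))
F = tree_clauses(T)
```

For this tree `F` has four clauses over three variables, so its deficiency $c(F) - n(F)$ is 1, and the trivial tree consisting of a single leaf yields the clause-set containing only the empty clause.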

By Lemma C.5 in ||:

Lemma 4.2 $F^1 : \mathcal{T}_1 \to \mathcal{SMU}_{\delta=1}$ is a bijection.

By $T^1 : \mathcal{SMU}_{\delta=1} \to \mathcal{T}_1$ we denote the inversion of $F^1$. Typically we identify
$(T, u) \in \mathcal{T}_1$ with $T$, and let the context determine $u$. So $T^1(F)$ is the full binary
tree, where the variable $v$ labelling the root (for $F \neq \{\bot\}$) is the unique variable
occurring in every clause of $F$, and the clause-sets determining the left resp. right
subtree are $\langle v \to 0 \rangle * F$ resp. $\langle v \to 1 \rangle * F$. By $w_C$ for $C \in F$ we denote the leaf $w$ of
$T^1(F)$ such that $C_w = C$. Furthermore we identify the literals of $F$ with the edges
of $T^1(F)$. Note that $c(F) = \#\mathrm{lvs}(T^1(F))$ and $n(F) = \#\mathrm{nds}(T^1(F)) - \#\mathrm{lvs}(T^1(F))$.

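Example 4.3 Consider the following labelled binary tree $T$:

[Figure: labelled binary tree $T$ with leaves numbered 1 to 6]

Then $F^1(T) = \{\{v_1, v_2, v_3\}, \{v_1, v_2, \overline{v_3}\}, \{v_1, \overline{v_2}, v_4\}, \{v_1, \overline{v_2}, \overline{v_4}\}, \{\overline{v_1}, v_5\}, \{\overline{v_1}, \overline{v_5}\}\}$,
where for example $C_3 = \{v_1, \overline{v_2}, v_4\}$ and $w_{\{\overline{v_1}, \overline{v_5}\}} = 6$.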
The effect of applying a partial assignment to some element of $\mathcal{SMU}_{\delta=1}$ is easily
described as follows:

Lemma 4.4 Consider $F \in \mathcal{SMU}_{\delta=1}$ and a literal $x \in \mathrm{lit}(F)$; let $\varphi := \langle x \to 1 \rangle$ and
$F' := \varphi * F$. We have:

1. $F' \in \mathcal{SMU}_{\delta=1}$.

2. Let $T := T^1(F)$ and $T' := T^1(F')$. The tree $T'$ is obtained from $T$ as follows:

(a) Consider the node $w$ of $T$ labelled with $\mathrm{var}(x)$. Let $T_x, T_{\overline{x}}$ be the two
subtrees hanging at $w$, following the edge labelled with $x$ resp. $\overline{x}$.

(b) Now $T'$ is obtained from $T$ by removing subtree $T_x$, and attaching $T_{\overline{x}}$
directly at position $w$.

Corollary 4.5 $\mathcal{SMU}_{\delta=1}$ is stable under application of partial assignments, that is,
for $F \in \mathcal{SMU}_{\delta=1}$ and $\varphi \in \mathcal{PASS}$ holds $\varphi * F \in \mathcal{SMU}_{\delta=1}$.

From Lemma ^ follows $\mathcal{SMU}_{\delta=1} \subset \mathcal{UHIT}$, where $\mathcal{HIT} \subset \mathcal{CLS}$ is the set
of hitting clause-sets, that is, those $F \in \mathcal{CLS}$ where every two clauses clash in
at least one literal, i.e., for all $C, D \in F$, $C \neq D$, we have $|C \cap \overline{D}| \geq 1$, and
$\mathcal{UHIT} := \mathcal{HIT} \cap \mathcal{USAT}$. It is well-known that $\mathcal{UHIT} \subset \mathcal{SMU}$ holds (for a proof
see Lemma 2 in psf).

4.2 Doping SMUs=i 

We are interested in clause-sets which have as many sub-mps's as possible:

Definition 4.6 A clause-set $F \neq \top$ is a total mps if $\mathrm{mps}(F) = \mathbb{P}(F) \setminus \{\top\}$.
Every total mps is an mps.

Example 4.7 $\{\{a,b\}, \{\overline{a},b\}, \{\overline{b}\}\}$ is a total mps, while $\{\{a,b\}, \{\overline{a}\}, \{\overline{b}\}\}$ is an mps
(since minimally unsatisfiable), but not a total mps.

By Lemma 3.9 and Corollary 3.8 we get:

Lemma 4.8 A clause-set $F$ is a total mps if and only if $F' := \varphi_{\mathrm{puc}(F)} * F$ is a total
mps, and $\varphi_{\mathrm{puc}(F)}$ is contraction-free for $F$. If $F$ is a total mps, then thus we have
$F' \in \mathcal{MU}$.

To determine all total mps's, it remains to determine the minimally unsatisfiable
total mps's. Before we can prove that these are precisely the saturated minimally
unsatisfiable clause-sets of deficiency 1, we need to state a basic property of these
clause-sets, which follows by definition of $T^1(F)$ for $F \in \mathcal{SMU}_{\delta=1}$ (recall Subsection
4.1):



Lemma 4.9 Consider $F \in \mathcal{SMU}_{\delta=1}$ and $F' \subseteq F$. Let $T := T^1(F)$. The set
$\mathrm{puc}(F')$ of pure literals of $F'$ can be determined as follows:

1. Let $W_{F'} := \{w_C : C \in F'\} \subseteq \mathrm{lvs}(T)$ be the set of leaves corresponding to the
clauses of $F'$.

2. For a literal $x \in \mathrm{lit}(F)$ let $w \in \mathrm{nds}(T)$ be the node labelled with $\mathrm{var}(x)$, and
let $T_x$ be the subtree of $w$ reached by $x$, and let $T_{\overline{x}}$ be the subtree of $w$ reached
by $\overline{x}$.

3. Now $x \in \mathrm{puc}(F')$ if and only if $W_{F'} \cap \mathrm{lvs}(T_x) \neq \emptyset$ and $W_{F'} \cap \mathrm{lvs}(T_{\overline{x}}) = \emptyset$.



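Example 4.10 Consider the clause-set $F := \{C_1, \dots, C_7\}$ with

$C_1 = \{v_1, v_2, v_3\}$, $C_2 = \{v_1, v_2, \overline{v_3}\}$, $C_3 = \{v_1, \overline{v_2}, v_4\}$, $C_4 = \{v_1, \overline{v_2}, \overline{v_4}\}$,
$C_5 = \{\overline{v_1}, v_5, v_6\}$, $C_6 = \{\overline{v_1}, v_5, \overline{v_6}\}$, $C_7 = \{\overline{v_1}, \overline{v_5}\}$,

and the subset $F' := \{C_1, C_3, C_4, C_7\}$. The tree $T^1(F)$ is as follows, with the dashed
edges representing literals not in $\bigcup F' = \{v_1, v_2, v_3, v_4, \overline{v_1}, \overline{v_2}, \overline{v_4}, \overline{v_5}\}$:

[Figure: the tree $T^1(F)$ with leaves numbered 1 to 7; dashed edges mark literals not in $\bigcup F'$]

We have $W_{F'} = \{1,3,4,7\}$ and

$\mathrm{puc}(F') = \bigcup F' \setminus \{v_1, \overline{v_1}, v_2, \overline{v_2}, v_4, \overline{v_4}\} = \{v_3, \overline{v_5}\}$.

Now consider $x \in \mathrm{lit}(F)$:

1. For $x = v_3$ holds $\mathrm{lvs}(T_{v_3}) \cap W_{F'} = \{1\}$ and $\mathrm{lvs}(T_{\overline{v_3}}) \cap W_{F'} = \emptyset$, thus $v_3 \in \mathrm{puc}(F')$.

2. For $x = \overline{v_5}$ holds $\mathrm{lvs}(T_{\overline{v_5}}) \cap W_{F'} = \{7\}$ and $\mathrm{lvs}(T_{v_5}) \cap W_{F'} = \emptyset$, thus $\overline{v_5} \in \mathrm{puc}(F')$.

3. Considering for example $x = v_1$, we have $\mathrm{lvs}(T_{v_1}) \cap W_{F'} = \{1, 3, 4\}$ and $\mathrm{lvs}(T_{\overline{v_1}}) \cap
W_{F'} = \{7\}$, thus $v_1 \notin \mathrm{puc}(F')$, while for $x = v_6$ we have $\mathrm{lvs}(T_{v_6}) \cap W_{F'} = \emptyset$
and $\mathrm{lvs}(T_{\overline{v_6}}) \cap W_{F'} = \emptyset$, thus $v_6 \notin \mathrm{puc}(F')$.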

Theorem 4.11 An unsatisfiable clause-set $F \in \mathcal{USAT}$ is a total mps if and only
if $F \in \mathcal{SMU}_{\delta=1}$.

Proof: First assume that $F$ is a total mps. Then every two clauses $C, D \in F$,
$C \neq D$, clash in exactly one literal (otherwise $\{C, D\} \notin \mathcal{MPS}$). In Q, Corollary
34, it was shown that an unsatisfiable clause-set $F$ has precisely one clash
between any pair of different clauses iff $F \in \mathcal{SMU}_{\delta=1}$ holds (an alternative
proof was found in [|4|).¹ Now assume $F \in \mathcal{SMU}_{\delta=1}$, and we have to show that $F$
is a total mps. So consider $F' \in \mathbb{P}(F) \setminus \{\top\}$, and let $C := \mathrm{puc}(F')$, $\varphi := \varphi_C$. Since
$F'$ is a hitting clause-set, $\varphi$ is contraction-free for $F'$, and according to Lemma 3.9
it remains to show that $F'' := \varphi * F'$ is unsatisfiable (recall that hitting clause-sets
are irredundant). Assume that $F''$ is satisfiable, and consider a partial assignment
$\psi$ with $\psi * F'' = \top$ and $\mathrm{var}(\psi) \cap \mathrm{var}(\varphi) = \emptyset$. We show that then $\varphi \cup \psi$ would be
a satisfying assignment for $F$, contradicting the assumption. To this end it suffices
to show that for all $D \in F \setminus F'$ holds $C \cap \overline{D} \neq \emptyset$. Consider $T := T^1(F)$, and let
$W_{F'}$ be defined as in Lemma 4.9. Starting from the leaf $w_D$, let $w$ be the first node
on the path to the root of $T$ such that one of the two subtrees of $w$ contains a leaf
of $W_{F'}$. Let $x$ be the literal at $w$ on the path to $w_D$. So by Lemma 4.9 we have
$\overline{x} \in C$, while by definition $x \in D$. □

¹ In [44] the notation "$\mathcal{UHIT}$" was used to denote "uniform hitting clause-sets", which is
now more appropriately called "(conflict-)regular hitting clause-sets", while "U" now stands for
"unsatisfiable".



The proof of Theorem 4.11 actually shows that for $F \in \mathcal{USAT}$ already from all
2-element subsets of $F$ being mps's follows $F \in \mathcal{SMU}_{\delta=1}$. We now turn
our attention to a closer understanding of the prime implicates $C$ of doped $F \in
\mathcal{SMU}_{\delta=1}$. We start with their identification with non-empty sub-clause-sets $F'$ of
$D(F)$:

Lemma 4.12 Consider a clause-set $F \in \mathcal{SMU}_{\delta=1}$. By Theorem 4.11 each non-
empty subset yields a minimal premise set. Thus by Theorem 3.18 we have:

1. $\mathrm{prc}_0(D(F)) = \{\mathrm{puc}(F') \mid \top \neq F' \subseteq D(F)\}$.

2. $|\mathrm{prc}_0(D(F))| = 2^{c(F)} - 1$.

The main result of |Q is the stronger result that the clause-sets $F \in \mathcal{CLS}$ with
$|\mathrm{prc}_0(F)| = 2^{c(F)} - 1$ are precisely the clause-sets $D(F)$ for $F \in \mathcal{SMU}_{\delta=1}$ when
allowing to replace the single doping variable of a clause by any non-empty set of
new (pure) literals. Back to the task at hand: Since the clauses of $D(F)$ can be
identified with leaves of the tree $T^1(F)$, we obtain a bijection between non-empty
sets $V$ of leaves of the tree $T^1(F)$ and prime implicates of $D(F)$:

Definition 4.13 For $F \in \mathcal{SMU}_{\delta=1}$ and $\emptyset \neq V \subseteq \mathrm{lvs}(T^1(F))$ the clause $C_V$ is the
prime implicate $\mathrm{puc}(\{D(C_w) : w \in V\})$ of $D(F)$ according to Lemma 4.12. For
$w \in \mathrm{lvs}(T^1(F))$ we furthermore set $u_w := u_{C_w}$.

By Lemma 4.12:

Lemma 4.14 For $F \in \mathcal{SMU}_{\delta=1}$ holds $\mathrm{prc}_0(D(F)) = \{C_V \mid \emptyset \neq V \subseteq \mathrm{lvs}(T^1(F))\}$.

How precisely the prime implicate $C_V$ is constructed from $V \subseteq \mathrm{lvs}(T^1(F))$ is shown
by the following lemma:

Lemma 4.15 Consider $F \in \mathcal{SMU}_{\delta=1}$ and $\emptyset \neq V \subseteq \mathrm{lvs}(T^1(F))$. We have $C_V =
U_V \cup P_V$, $U_V \cap P_V = \emptyset$, where

1. $U_V := \{u_w \mid w \in V\}$, and

2. $P_V := \mathrm{puc}(F')$ for $F' := \{C_w : w \in V\}$ as given in Lemma 4.9, that is, $P_V$ is
the set of literals $x$ such that $V \cap \mathrm{lvs}(T_x) \neq \emptyset$ and $V \cap \mathrm{lvs}(T_{\overline{x}}) = \emptyset$.

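Example 4.16 Consider the clause-set

$F := \{\{v_1, v_2\}, \{v_1, \overline{v_2}\}, \{\overline{v_1}, v_3\}, \{\overline{v_1}, \overline{v_3}\}\} \in \mathcal{SMU}_{\delta=1}$

corresponding to the tree

[Figure: full binary tree with root $v_1$ and leaves numbered 1 to 4]

with the doped clause-set

$D(F) = \{\{v_1, v_2, u_1\}, \{v_1, \overline{v_2}, u_2\}, \{\overline{v_1}, v_3, u_3\}, \{\overline{v_1}, \overline{v_3}, u_4\}\}$.

Now consider the set $V := \{1,3\}$. According to Definition 4.13 we have that $C_V =
\mathrm{puc}(\{\{v_1, v_2, u_1\}, \{\overline{v_1}, v_3, u_3\}\}) = \{v_2, v_3, u_1, u_3\}$. By Lemma 4.15 we have that
$C_V = U_V \cup P_V$, where $U_V = \{u_1, u_3\}$ and $P_V = \mathrm{puc}(\{\{v_1, v_2\}, \{\overline{v_1}, v_3\}\}) = \{v_2, v_3\}$.
Note that for both $x \in \{v_2, v_3\} = P_V$ we have that $\mathrm{lvs}(T_x) \cap V \neq \emptyset$ and $\mathrm{lvs}(T_{\overline{x}}) \cap V =
\emptyset$, but we do not have this for $x \in \mathrm{lit}(F) \setminus \{v_2, v_3\}$.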
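
The statements of Lemma 4.12 can be cross-checked on the doped clause-set of Example 4.16 by brute force. The sketch below is an illustration under stated assumptions: variables $v_1, v_2, v_3$ are encoded as 1, 2, 3 and the doping variables $u_1, \dots, u_4$ as 4, 5, 6, 7; prime implicates are computed as the subsumption-minimal clauses of the resolution closure, which is feasible only for tiny inputs. All function names are hypothetical.

```python
from itertools import combinations

def resolvent(C, D):
    """Resolvent of C and D if they clash in exactly one literal, else None."""
    clashes = {x for x in C if -x in D}
    if len(clashes) != 1:
        return None
    (x,) = clashes
    return frozenset((C | D) - {x, -x})

def prime_implicates(F):
    """Subsumption-minimal clauses of the closure of F under resolution."""
    closure = {frozenset(C) for C in F}
    while True:
        new = {R for C in closure for D in closure
               if (R := resolvent(C, D)) is not None and R not in closure}
        if not new:
            break
        closure |= new
    return {C for C in closure if not any(D < C for D in closure)}

def puc(F):
    """Pure clause of F."""
    L = set().union(*F)
    return frozenset(x for x in L if -x not in L)

# D(F) from Example 4.16
DF = [frozenset(c) for c in ({1, 2, 4}, {1, -2, 5}, {-1, 3, 6}, {-1, -3, 7})]
primes = prime_implicates(DF)
pure_clauses = {puc(S) for r in range(1, len(DF) + 1)
                for S in combinations(DF, r)}
```

Comparing `primes` with `pure_clauses` and counting its elements reproduces both parts of Lemma 4.12 for this instance; the clause for $V = \{1,3\}$ is `{2, 3, 4, 6}`.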

The hardness of $F$ as well as $D(F)$ is the Horton-Strahler number of $T^1(F)$:

Lemma 4.17 Consider $F \in \mathcal{SMU}_{\delta=1}$, and let $k := \mathrm{hs}(T^1(F))$. Then
$\mathrm{hd}(F) = \mathrm{hd}(D(F)) = k$.

Proof: Let $T := T^1(F)$. First we show $\mathrm{hd}(F) = k$. We have $\mathrm{hd}(F) \leq k$, since $T$ is
by definition of $F = F^1(T)$ already a resolution tree (when extending the labelling
of leaves to all nodes), deriving $\bot$ from $F$. To show $\mathrm{hd}(F) \geq k$, we use Lemma
2.8 with $\mathcal{C} := \mathcal{SMU}_{\delta=1}$ and $h(F) := \mathrm{hs}(T^1(F))$. Based on Lemma 4.4, we consider
the effect on the Horton-Strahler number of assigning a truth value to one variable
$v \in \mathrm{var}(F)$. Let $w \in \mathrm{nds}(T)$ be the (inner) node labelled with $v$, and let $T^w_0, T^w_1$ be
the left resp. right subtree hanging at $w$. Now the effect of assigning $\varepsilon \in \{0,1\}$ to $v$
is to replace $T_w$ with $T^w_\varepsilon$. Let $T_\varepsilon$ be the (whole) tree obtained by assigning $\varepsilon$ to $v$,
that is, $T_\varepsilon := T^1(\langle v \to \varepsilon \rangle * F)$. If $\mathrm{hs}(T^w_0) = \mathrm{hs}(T^w_1)$, then we have $\mathrm{hs}(T_\varepsilon) \geq k - 1$,
since at most one increase of the Horton-Strahler number for subtrees is missed out
now. Otherwise we have $\mathrm{hs}(T_0) = \mathrm{hs}(T)$ or $\mathrm{hs}(T_1) = \mathrm{hs}(T)$, since removal of the
subtree with the smaller Horton-Strahler number has no influence on the Horton-
Strahler number of the whole tree. So altogether Lemma 2.8 is applicable, which
concludes the proof of $\mathrm{hd}(F) = k$.

For showing $\mathrm{hd}(D(F)) = k$ we use Lemma 3.20: so consider $F' \subseteq F$ and $\varphi \in
\mathcal{PASS}$ with $\varphi * F' \in \mathcal{USAT}$, let $F'' := \varphi * F'$, and we have to show $\mathrm{hd}(F'') \leq k$.
W.l.o.g. $\mathrm{var}(\varphi) \subseteq \mathrm{var}(F')$. By Corollary 4.5 we have that $\varphi * F \in \mathcal{SMU}_{\delta=1}$, and
thus $\varphi * F = F''$ must hold, and $\mathrm{hd}(F'') = \mathrm{hs}(T^1(F''))$ (by the first part). By
Lemma 4.4, $T^1(F'')$ results from $T$ by a sequence of removing subtrees, and it is
easy to see that thus $\mathrm{hs}(T^1(F'')) \leq k$ holds. □

We summarise what we have learned about $D(F)$ for $F \in \mathcal{SMU}_{\delta=1}$:

Theorem 4.18 Consider $F \in \mathcal{SMU}_{\delta=1}$.

1. For each clause-set $F'$ equivalent to $D(F)$ there is an injection $i : D(F) \to F'$ with $\forall\, C \in D(F) : C \subseteq i(C)$ (by Lemma ??).

2. $D(F)$ is a total mps (by Lemma ?? together with Theorem ??).

3. The prime implicates of $D(F)$ are given by Lemma 4.15.

4. $\mathrm{hd}(D(F)) = \mathrm{hs}(T_1(F))$ (by Lemma 4.17).



5 Lower bounds

This section proves the main result of this article, Theorem 5.13, which exhibits for every $k \ge 0$ sequences $(F^{k+1}_n)_{n \in \mathbb{N}}$ of small clause-sets of hardness $k+1$, where every equivalent clause-set of hardness $k$ (indeed of w-hardness $k$) is of exponential size. In this way we show that the $\mathcal{UC}_k$ hierarchy is useful, i.e., equivalent clause-sets with higher hardness can be substantially shorter. These $F^{k+1}_n$ are doped versions of clause-sets from $\mathcal{SMU}_{\delta=1}$ (recall Theorem 4.18), which are "extremal", that is, their underlying trees $T_1(F^{k+1}_n)$ are for given Horton-Strahler number $k+1$ and height $n$ as large as possible.

The organisation of this section is as follows: In Subsection 5.1 the main tool for showing size-lower-bounds for equivalent clause-sets of a given (w-)hardness is established in Theorem 5.4. Subsection 5.2 introduces the "extremal trees". Subsection 5.3 shows the main lower bound in Theorem 5.12, and applies it to show the separation Theorem 5.13.



5.1 Trigger hypergraphs 



A hypergraph is a pair $G = (V,E)$, where $V$ is a set (of "vertices") and $E \subseteq \mathbb{P}(V)$ (the set of hyperedges), where one uses $V(G) := V$ and $E(G) := E$. A transversal of a hypergraph $G$ is a set $T \subseteq V(G)$ such that for all $E \in E(G)$ holds $T \cap E \neq \emptyset$. The minimum size of a transversal is denoted by $\tau(G)$, the transversal number. And let $\nu(G)$ be the matching number of $G$, the maximum number of pairwise disjoint hyperedges. Obviously we have $\tau(G) \ge \nu(G)$ for all hypergraphs $G$.
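The two hypergraph parameters can be illustrated by a brute-force sketch. The function names and the toy hypergraph below are hypothetical and serve only as an illustration; the search is exponential and only meant for very small instances.

```python
from itertools import combinations

def transversal_number(vertices, hyperedges):
    """Smallest size of a vertex set meeting every hyperedge (brute force)."""
    vertices = list(vertices)
    for size in range(len(vertices) + 1):
        for cand in combinations(vertices, size):
            chosen = set(cand)
            if all(chosen & e for e in hyperedges):
                return size
    raise ValueError("an empty hyperedge admits no transversal")

def matching_number(hyperedges):
    """Largest number of pairwise disjoint hyperedges (brute force)."""
    edges = list(hyperedges)
    for size in range(len(edges), 0, -1):
        for cand in combinations(edges, size):
            if all(not (a & b) for a, b in combinations(cand, 2)):
                return size
    return 0

# Hypothetical toy hypergraph: a triangle, where the inequality is strict.
triangle = [frozenset({1, 2}), frozenset({2, 3}), frozenset({1, 3})]
tau = transversal_number({1, 2, 3}, triangle)
nu = matching_number(triangle)
```

For the triangle any two hyperedges intersect, so a matching has size at most one, while no single vertex meets all three hyperedges.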

Definition 5.1 Consider $k \in \mathbb{N}_0$ and $F \in \mathcal{CLS}$. The trigger hypergraph $T_k(F)$ is the hypergraph with the prime implicates of $F$ as its vertices, and for every prime implicate $C$ of $F$ a hyperedge $E^k_C$. The hyperedge $E^k_C$ contains all prime implicates $C' \in \mathrm{prc}_0(F)$ which are not satisfied by $\varphi_C$ and yield a clause of size at most $k$ under $\varphi_C$. That is,

1. $V(T_k(F)) := \mathrm{prc}_0(F)$, and

2. $E(T_k(F)) := \{E^k_C \mid C \in \mathrm{prc}_0(F)\}$,
where $E^k_C := \{C' \in \mathrm{prc}_0(F) \mid C' \cap \overline{C} = \emptyset \wedge |C' \setminus C| \le k\}$.

Note that the trigger hypergraph of $F \in \mathcal{CLS}$ depends only on the underlying boolean function of $F$, and thus for every equivalent $F'$ we have $T_k(F') = T_k(F)$.
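As a minimal sketch of Definition 5.1, the following computes prime implicates and hyperedges by exhaustive enumeration. The encoding (literals as non-zero integers, clauses as frozensets) and all function names are assumptions made for this illustration, and the example clause-set is hypothetical.

```python
from itertools import product

def satisfies(assignment, clause):
    """True iff the total assignment (dict var -> bool) satisfies the clause."""
    return any(assignment[abs(x)] == (x > 0) for x in clause)

def prime_implicates(F):
    """All minimal clauses implied by F, found by enumerating every clause."""
    vs = sorted({abs(x) for C in F for x in C})
    models = [a for a in (dict(zip(vs, bits))
                          for bits in product([False, True], repeat=len(vs)))
              if all(satisfies(a, C) for C in F)]
    implicates = []
    # Each variable is absent (0), positive (1) or negative (-1) in a candidate.
    for signs in product([0, 1, -1], repeat=len(vs)):
        C = frozenset(s * v for s, v in zip(signs, vs) if s)
        if all(satisfies(a, C) for a in models):
            implicates.append(C)
    return {C for C in implicates if not any(D < C for D in implicates)}

def trigger_hypergraph(F, k):
    """Map each prime implicate C to its hyperedge E^k_C."""
    P = prime_implicates(F)
    return {C: {Cp for Cp in P
                if not any(-x in C for x in Cp)   # C' has no clash with C
                and len(Cp - C) <= k}             # at most k literals remain
            for C in P}

# Hypothetical example: F = {{v1, v2}, {-v1, v3}} has the additional
# prime implicate {v2, v3} (the resolvent).
F = [frozenset({1, 2}), frozenset({-1, 3})]
T0, T1 = trigger_hypergraph(F, 0), trigger_hypergraph(F, 1)
```

For $k = 0$ every hyperedge is a singleton, and adding the resolvent to $F$ leaves the hypergraph unchanged, in line with the remark that only the underlying boolean function matters.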

Example 5.2 Consider the clause-set 

F ■■= { {vi,V^,lH},{v2,V3,lH},{v2,V^,V4},{v^,V3,Vi},{vi,V3,V4},{vi,V2} }■ 



where the clauses are labelled $C_1, \dots, C_6$ in the order listed.



As shown in Example 8.2 of [31, 32] we have $\mathrm{prc}_0(F) = F$. The trigger hypergraph $T_0(F)$ is (as always) the hypergraph with all singleton sets, i.e., $E(T_0(F)) = \{\{C_1\}, \dots, \{C_6\}\}$. The hypergraphs $T_k(F)$ for $k \in \{1,2\}$ are represented by Figures 1 and 2.

Figure 1: $T_1(F)$

Figure 2: $T_2(F)$

To interpret the diagrams: 

1. An arrow from a clause $C$ to a clause $D$ represents that $C \in E^k_D$.

2. A dotted arrow from $C$ to $D$ represents that $|C \setminus D| > k$ (so $C \notin E^k_D$), but $C \cap \overline{D} = \emptyset$, and thus for some large enough $k' > k$ we will have $C \in E^{k'}_D$.

3. No arrow between $C$ and $D$ indicates that $C \cap \overline{D} \neq \emptyset$ (i.e., for all $k'$ we have $C \notin E^{k'}_D$ and $D \notin E^{k'}_C$).

4. The size of a hyperedge $E^k_D$ is the in-degree of the vertex $D$.

Consider $E^1_{C_6} = \{C_6\}$ and $E^2_{C_6} = \{C_1, C_2, C_3, C_5, C_6\}$. As we will see in Lemma 5.3, therefore every $F' \subseteq F$ equivalent to $F$ such that $F' \in \mathcal{UC}_1$ must have $C_6 \in F'$. However, $E^2_{C_6}$ contains more clauses than $E^1_{C_6}$, and for example $F \setminus \{C_6\} \in \mathcal{UC}_2 \setminus \mathcal{UC}_1$ as shown in Example 8.2 of [31]. Using the above diagrammatic notation, we can also see that for all $k' \ge 2$ we have $T_{k'}(F) = T_2(F)$, as there are no dotted lines for $T_2(F)$ (i.e., no clauses $C$ and $D$ such that $|C \setminus D| > 2$ but $C \cap \overline{D} = \emptyset$).

Lemma 5.3 Consider $k \in \mathbb{N}_0$ and $F \in \mathcal{CLS}$ with $\mathrm{whd}(F) \le k$. Then there is a clause-set $F'$ such that

1. $F' \subseteq \mathrm{prc}_0(F)$ and $F'$ is equivalent to $F$;

2. there is an injection $i : F' \to F$ such that $\forall\, C \in F' : C \subseteq i(C)$;

3. $\mathrm{whd}(F') \le k$;

4. $F'$ is a transversal of $T_k(F)$.

Proof: Obtain $F'$ from $F$ by choosing for every $C \in F$ some $C' \in \mathrm{prc}_0(F)$ with $C' \subseteq C$. Then the first two properties are obvious, while Property 3 follows from Part 1 of Lemma 6.1 in [?]. Assume that $F'$ is not a transversal of $T_k(F)$, that is, there is $C \in \mathrm{prc}_0(F)$ with $F' \cap E^k_C = \emptyset$. Then $\varphi_C * F' \in \mathcal{USAT}$, but every clause has length strictly greater than $k$, and thus $k$-resolution does not derive $\bot$ from $\varphi_C * F'$, contradicting $\mathrm{whd}(F') \le k$. □

Directly from Lemma 5.3 follows:

Theorem 5.4 For $k \in \mathbb{N}_0$ and $F \in \mathcal{WC}_k$ we have $c(F) \ge \tau(T_k(F))$.



5.2 Extremal trees 

For a given hardness $k \ge 1$ we need to construct (full binary) trees which are as large as possible; this is achieved by specifying the height, and using trees which are "filled up" completely for the given parameter values:

Definition 5.5 A pair $(k,h) \in \mathbb{N}_0^2$ with $h \ge k$ and $k = 0 \Rightarrow h = 0$ is called an allowed parameter pair. For an allowed parameter pair $(k,h)$ a full binary tree $T$ is called an extremal tree of Horton-Strahler number $k$ and height $h$ if

1. $\mathrm{hs}(T) = k$, $\mathrm{ht}(T) = h$;

2. for all $T'$ with $\mathrm{hs}(T') \le k$ and $\mathrm{ht}(T') \le h$ we have $\#\mathrm{nds}(T') \le \#\mathrm{nds}(T)$.

We denote the set of all extremal trees with Horton-Strahler number $k$ and height $h$ by $\mathrm{HS}(k,h)$.

Note that for allowed parameter pairs $(k,h)$ we have $k = 0 \Leftrightarrow h = 0$. Extremal trees are easily characterised and constructed as follows:

1. $\mathrm{HS}(0,0)$ contains only the trivial tree (with one node).

2. $\mathrm{HS}(1,h)$ for $h \in \mathbb{N}$ consists exactly of the full binary trees $T$ with $\mathrm{hs}(T) = 1$ and $\mathrm{ht}(T) = h$, which can also be characterised as those full binary trees $T$ with $\mathrm{ht}(T) = h$ such that every inner node has at least one child which is a leaf.

3. For $k \ge 2$ and $h \ge k$ we have $T \in \mathrm{HS}(k,h)$ iff $T$ has the left subtree $T_0$ and the right subtree $T_1$, and there is $\varepsilon \in \{0,1\}$ with $T_\varepsilon \in \mathrm{HS}(k-1,h-1)$ and $T_{1-\varepsilon} \in \mathrm{HS}(\min(k,h-1),h-1)$.

Lemma 5.6 For all allowed parameter pairs $(k,h)$ we have $\mathrm{HS}(k,h) \neq \emptyset$.

The unique elements of $\mathrm{HS}(k,k)$ for $k \in \mathbb{N}_0$ are the perfect binary trees of height $k$, which are the smallest binary trees of Horton-Strahler number $k$.

Lemma 5.7 For an allowed parameter pair $(k,h)$ and for $T \in \mathrm{HS}(k,h)$ we have $\#\mathrm{lvs}(T) = \alpha(k,h) := \sum_{i=0}^{k} \binom{h}{i}$. We have $\alpha(k,h) = \Theta(h^k)$ for fixed $k$.

Proof: For $k \le 1$ we have $\alpha(0,0) = 1$ and $\alpha(1,h) = 1 + h$, which are obviously correct. Now consider $k \ge 2$. By induction hypothesis we get

$\#\mathrm{lvs}(T) = \alpha(k-1,h-1) + \alpha(\min(k,h-1),h-1)$.

If $h = k$, then $\alpha(k,h) = 2^k$ (for all $k$), and we get $\#\mathrm{lvs}(T) = \alpha(k-1,k-1) + \alpha(k-1,k-1) = 2 \cdot 2^{k-1} = 2^k = \alpha(k,k)$. Otherwise we have

$\#\mathrm{lvs}(T) = \alpha(k-1,h-1) + \alpha(k,h-1) = \binom{h-1}{0} + \sum_{i=1}^{k} \left( \binom{h-1}{i-1} + \binom{h-1}{i} \right) = \sum_{i=0}^{k} \binom{h}{i} = \alpha(k,h)$.

□
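The leaf count of Lemma 5.7 can be checked against the recursive characterisation of extremal trees for small parameters. This is a minimal sketch; the function names are hypothetical.

```python
from functools import lru_cache
from math import comb

def alpha(k, h):
    """Closed form alpha(k, h) = sum_{i=0}^{k} binom(h, i)."""
    return sum(comb(h, i) for i in range(k + 1))

@lru_cache(maxsize=None)
def extremal_leaves(k, h):
    """Number of leaves of a tree in HS(k, h), following the recursive
    characterisation; (k, h) is assumed to be an allowed parameter pair."""
    if k == 0:
        assert h == 0, "k = 0 forces h = 0"
        return 1
    if k == 1:
        return h + 1  # every inner node has a leaf child
    return extremal_leaves(k - 1, h - 1) + extremal_leaves(min(k, h - 1), h - 1)
```

For instance, `extremal_leaves(2, 3)` follows the split into subtrees from $\mathrm{HS}(1,2)$ and $\mathrm{HS}(2,2)$ with $3 + 4$ leaves.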



Example 5.8 Consider the following labelled binary tree T : 




Applying the recursive construction/characterisation we see $T \in \mathrm{HS}(2,3)$. By simple counting we see that $T$ has 7 leaves, in agreement with Lemma 5.7: $\sum_{i=0}^{2} \binom{3}{i} = \binom{3}{0} + \binom{3}{1} + \binom{3}{2} = 1 + 3 + 3 = 7$. Assuming that of the two subtrees at an inner node, the left subtree has Horton-Strahler number at least as big as the right subtree, the idea is that the sum runs over the number $i$ of right turns in a path from the root to the leaves. In the above tree $T$, the number of right turns is indicated as an index to the leaf-name. If the Horton-Strahler number is $k$, with at most $k$ right-turns we must be able to reach every leaf.



We summarise the additional knowledge over Theorem 4.18:

Lemma 5.9 Consider an allowed parameter pair $(k,h)$ and $T \in \mathrm{HS}(k,h)$, and let $F := F_1(T)$.

1. $n(D(F)) = 2 \cdot \alpha(k,h) - 1$ ($= \Theta(h^k)$ for fixed $k$).

2. $c(D(F)) = \alpha(k,h)$ ($= \Theta(h^k)$ for fixed $k$).

3. $D(F) \in \mathcal{UC}_k \setminus \mathcal{UC}_{k-1}$ (for $k \ge 1$).

In Theorem 5.13 we will see that these $D(F)$ from Lemma 5.9 do not have short equivalent clause-sets of hardness $k - 1$.

5.3 Exponential lower bounds on "better" equivalent clause-sets

The depth of a node $w$ in a rooted tree $T$, denoted by $d_T(w) \in \mathbb{N}_0$, is the length of the path from the root of $T$ to $w$. Recall that two sets $A, B$ are incomparable iff $A \not\subseteq B$ and $B \not\subseteq A$. Furthermore we call two sets $A, B$ incomparable on a set $C$ if the sets $A \cap C$ and $B \cap C$ are incomparable.

Definition 5.10 Consider a full binary tree $T$, where every leaf has depth at least $k + 1$. Consider furthermore $\emptyset \neq V, V' \subseteq \mathrm{lvs}(T)$. Then $V$ and $V'$ are depth-$k$-incomparable for $T$ if $V$ and $V'$ are incomparable on $\mathrm{lvs}(T_w)$ for all $w \in \mathrm{nds}(T)$ with $d_T(w) = k$.

Note that for all allowed parameter pairs $(k,h)$ and $T \in \mathrm{HS}(k,h)$ every leaf has depth at least $k$.

Lemma 5.11 Consider $k \in \mathbb{N}_0$, $T \in \mathcal{T}_1$, and $\emptyset \neq V_0, V_1 \subseteq \mathrm{lvs}(T)$ which are depth-$k$-incomparable for $T$. Let $F := D(F_1(T))$ and consider $T_k(F)$ (recall Definition 5.1). Then the hyperedges $E^k_{C_{V_0}}$, $E^k_{C_{V_1}}$ are disjoint (recall the definition of $C_V$).

Proof: Assume that $E^k_{C_{V_0}}$, $E^k_{C_{V_1}}$ are not disjoint; thus there is $\emptyset \neq V \subseteq \mathrm{lvs}(T)$ with $C_V \in E^k_{C_{V_0}} \cap E^k_{C_{V_1}}$. We will show that there is $\varepsilon \in \{0,1\}$ with $|C_V \setminus C_{V_\varepsilon}| \ge k + 1$, which contradicts the definition of $T_k(F)$.

Since $V \neq \emptyset$, there is $w \in V$. Consider the first $k + 1$ nodes $w_1, \dots, w_{k+1}$ on the path from the root to $w$. Let $w_i'$ be the child of $w_{i-1}$ different from $w_i$ for $i \in \{2, \dots, k+1\}$, and let $T_i := T_{w'_{i+1}}$ for $i \in \{1, \dots, k\}$, while $T_{k+1} := T_{w_{k+1}}$; see Figure 3. We show that each of $T_1, \dots, T_{k+1}$ contributes at least two unique literals to $|C_V \setminus C_{V_0}| + |C_V \setminus C_{V_1}|$, so that we get $|C_V \setminus C_{V_0}| + |C_V \setminus C_{V_1}| \ge (k+1) \cdot 2$, from which follows that there is $\varepsilon \in \{0,1\}$ with $|C_V \setminus C_{V_\varepsilon}| \ge k + 1$ as claimed.

Due to the depth-$k$-incomparability of $V_0, V_1$, for each $i \in \{1, \dots, k+1\}$ and each $\varepsilon \in \{0,1\}$ there are nodes $v_i^\varepsilon$ with $v_i^\varepsilon \in (\mathrm{lvs}(T_i) \cap V_\varepsilon) \setminus V_{1-\varepsilon}$. We have two cases now:

I. If $v_i^\varepsilon \in V$, then $u_{v_i^\varepsilon} \in C_V \setminus C_{V_{1-\varepsilon}}$.

II. If $v_i^\varepsilon \notin V$, then consider the first node $v$ on the path from $v_i^\varepsilon$ to the root such that for the other child $v'$ of $v$, not on that path to the root, holds $\mathrm{lvs}(T_{v'}) \cap V \neq \emptyset$: now for the literal $x$ labelling the edge from $v$ to $v'$ we have $x \in C_V \setminus C_{V_\varepsilon}$. Note that $v$ is below or equal to $w_i$ (due to $w \in V$).

For each $\varepsilon \in \{0,1\}$, the literals collected in $C_V \setminus C_{V_\varepsilon}$ from these $k + 1$ sources do not coincide, due to the pairwise node-disjointness of the trees $T_1, \dots, T_{k+1}$. □



Theorem 5.12 Consider $k \in \mathbb{N}_0$, $h \ge k + 1$, and $T \in \mathrm{HS}(k+1,h)$; let $F := D(F_1(T))$ and $m := \alpha(1,h-k) = 1 + h - k$. We have

$\nu(T_k(F)) \ge \binom{m}{\lfloor m/2 \rfloor} > \frac{1}{\sqrt{2}} \cdot \frac{2^m}{\sqrt{m}} = \Theta\!\left(\frac{2^h}{\sqrt{h}}\right)$,

where the second inequality assumes $h \ge k + 5$, while the $\Theta$-estimation assumes fixed $k$.

Figure 3: Illustration of sub-trees $T_1, \dots, T_{k+1}$.

Proof: For every $S \subseteq \mathbb{P}(\mathrm{lvs}(T))$ with $\emptyset \notin S$, such that every two different elements of $S$ are depth-$k$-incomparable for $T$, we have $\nu(T_k(F)) \ge |S|$ by Lemma 5.11. We can actually determine the maximal size of such an $S$, which is $M := \binom{m}{m'}$, where $m' := \lfloor m/2 \rfloor$, as follows. Let $\mathbb{T} := \{T_w : w \in \mathrm{nds}(T) \wedge d_T(w) = k\}$; note that for $T', T'' \in \mathbb{T}$ with $T' \neq T''$ we have $\mathrm{lvs}(T') \cap \mathrm{lvs}(T'') = \emptyset$. Choose $T_0 \in \mathbb{T}$ with minimal $\#\mathrm{lvs}(T_0)$; by Lemma 5.7 we have $\#\mathrm{lvs}(T_0) = m$. Let $S_0 := \{V \cap \mathrm{lvs}(T_0) : V \in S\}$. Then $S_0$ is an antichain (i.e., the elements of $S_0$ are pairwise incomparable) and $|S_0| = |S|$. By Sperner's Theorem ([?]) holds $|S_0| \le M$, and this upper bound $M$ is realised, just observing the antichain-condition, by choosing for $S_0$ the set $\binom{\mathrm{lvs}(T_0)}{m'}$ of subsets of $\mathrm{lvs}(T_0)$ of size $m'$. This construction of $S_0$ can be extended to a construction of $S$ (of the same size) by choosing for each $T' \in \mathbb{T}$ an injection $j_{T'} : S_0 \to \binom{\mathrm{lvs}(T')}{m'}$ and defining $S := \{\bigcup_{T' \in \mathbb{T}} j_{T'}(V)\}_{V \in S_0}$. The given estimation of $M$ follows from Stirling's approximation. □
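The estimate for the Sperner bound can be checked numerically. The following sketch assumes that the estimate in Theorem 5.12 reads $\binom{m}{\lfloor m/2 \rfloor} > 2^m / \sqrt{2m}$; this reading matches the stated side condition $h \ge k + 5$, i.e. $m \ge 6$, but the constant is an assumption of this illustration. The comparison is done in exact integer arithmetic by squaring both sides.

```python
from math import comb

def sperner_bound(m):
    """Maximum size of an antichain of subsets of an m-element set."""
    return comb(m, m // 2)

def exceeds_estimate(m):
    """Exact test of binom(m, floor(m/2)) > 2^m / sqrt(2 m),
    rewritten as 2 m * binom(m, floor(m/2))^2 > 4^m."""
    return 2 * m * sperner_bound(m) ** 2 > 4 ** m
```

Under this reading the inequality fails for $m = 5$ (since $2 \cdot 5 \cdot 10^2 = 1000 < 1024$) and holds from $m = 6$ on in the tested range.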



We are now able to state the main result of this article, proving Conjecture 1.1 from [?] that $\mathcal{UC}_k$, and indeed also $\mathcal{WC}_k$, is a proper hierarchy of boolean functions regarding polysize representations without new variables (see Subsection 3.1 for a discussion of "representations" in general):



Theorem 5.13 Consider $k \in \mathbb{N}_0$. For $h \ge k + 1$ choose one $T_h \in \mathrm{HS}(k+1,h)$ (note there is up to left-right swaps exactly one element in $\mathrm{HS}(k+1,h)$), and let $F_h := D(F_1(T_h))$. Consider the sequence $(F_h)_{h \ge k+1}$.

1. By Lemma 5.9 we have $n(F_h) = \Theta(h^{k+1})$ as well as $c(F_h) = \Theta(h^{k+1})$, and $F_h \in \mathcal{UC}_{k+1}$.

2. Consider a sequence $(F'_h)_{h \ge k+1}$ of clause-sets with $F'_h$ equivalent to $F_h$, such that $F'_h \in \mathcal{WC}_k$. By Theorems 5.12, 5.4 we have $c(F'_h) = \Omega(2^h / \sqrt{h})$.



We conjecture that Theorem 5.13 can be strengthened by including the PC-hierarchy in the following way:

Conjecture 5.14 For every $k \in \mathbb{N}_0$ there exists a sequence $(F_n)_{n \in \mathbb{N}}$ of clause-sets in $\mathcal{PC}_{k+1}$, where for convenience we assume $n(F_n) = n$ for all $n$, such that $(\ell(F_n))_{n \in \mathbb{N}}$ is polynomially bounded, and such that for every sequence $(F'_n)_{n \in \mathbb{N}}$ in $\mathcal{WC}_k$, where for all $n \in \mathbb{N}$ holds that $F'_n$ is equivalent to $F_n$, the sequence $(\ell(F'_n))_{n \in \mathbb{N}}$ is not polynomially bounded.



6 Analysing the Tseitin translation 

We now turn to upper bounds, investigating cases where the Tseitin translation yields representations in $\mathcal{UC}$. We consider two main cases: translating a DNF into a CNF, or translating an XOR-circuit. In Subsection 6.1 we discuss the general notion of "CNF representation". In Subsection 6.2 we discuss translating DNF into CNF, which we consider as a map from $\mathcal{CLS}$ to $\mathcal{CLS}$, and which we call the "canonical translation". Lemma 6.10 shows that the hardness of canonical translation results can be arbitrarily high. On the other hand, Lemma 6.11 shows that for hitting DNF the canonical translation result is in $\mathcal{UC}$, and Theorem 6.13 applies this to our lower bound examples, in contrast to Theorem 5.13 (so we see that new variables here help). Finally, by using only the necessary direction of the equivalences in the Tseitin translation, in Lemma 6.16 we see that for this "reduced canonical translation" the result is always in $\mathcal{UC}$. We conclude by discussing representations of XOR-clause-sets in Subsection 6.3.



6.1 CNF-representations 

In Subsections 1.4 and 9.2 of [?] we discussed representations of boolean functions in general. The most general notion useful in the SAT-context seems to allow existentially quantified new variables, which yields the following basic definition:

Definition 6.1 A CNF-representation of $F \in \mathcal{CLS}$ (as CNF) is a clause-set $F' \in \mathcal{CLS}$ with $\mathrm{var}(F) \subseteq \mathrm{var}(F')$ such that the satisfying assignments of $F'$ (as CNF) projected to $\mathrm{var}(F)$ are precisely the satisfying assignments of $F$.

Note that the CNF-representations $F'$ of $F$ without new variables, i.e., with $\mathrm{var}(F') = \mathrm{var}(F)$, are precisely the clause-sets $F'$ equivalent to $F$ with $\mathrm{var}(F') = \mathrm{var}(F)$. We have conjectured in [?], [?] (Conjecture 9.4) that Theorem 5.13 (and Conjecture 5.14) also holds when allowing new variables, which in this context we can rephrase as follows, also extending the conjecture by including $\mathcal{WC}_k$ (see Conjecture ?? for a further strengthening):

Conjecture 6.2 For every $k \in \mathbb{N}_0$ there exists a sequence $(F_n)_{n \in \mathbb{N}}$ of clause-sets, such that there is a sequence $(F'_n)_{n \in \mathbb{N}}$, where each $F'_n$ is a CNF-representation of $F_n$, we have $F'_n \in \mathcal{PC}_{k+1}$, and where $\ell(F'_n)$ is polynomial in $n$, but where there is no such sequence $(F''_n)_{n \in \mathbb{N}}$ with $F''_n \in \mathcal{WC}_k$.

Our basic condition for a "good" representation $F'$ of $F \in \mathcal{CLS}$ is that $F' \in \mathcal{UC}_k$ holds for some "low" $k$ (a constant if $F$ depends on parameters). This is what we call the absolute condition: regarding the requirement of detecting unsatisfiability of $\varphi * F'$ for some partial assignment $\varphi$, we do not distinguish between original variables (those in $\mathrm{var}(F)$) and new variables (those in $\mathrm{var}(F') \setminus \mathrm{var}(F)$), that is, $\mathrm{var}(\varphi) \subseteq \mathrm{var}(F')$ is considered. If we consider only $\mathrm{var}(\varphi) \subseteq \mathrm{var}(F)$, then we obtain the relative condition:

Definition 6.3 For $F \in \mathcal{CLS}$ and $V \subseteq \mathcal{VA}$ the relative hardness $\mathrm{hd}^V(F) \in \mathbb{N}_0$ is defined as the minimum $k \in \mathbb{N}_0$ such that for all partial assignments $\varphi \in \mathcal{PASS}$ with $\mathrm{var}(\varphi) \subseteq V$ and $\varphi * F \in \mathcal{USAT}$ we have $r_k(\varphi * F) = \{\bot\}$. And the relative w-hardness $\mathrm{whd}^V(F) \in \mathbb{N}_0$ is defined as the minimum $k \in \mathbb{N}_0$ such that for all partial assignments $\varphi \in \mathcal{PASS}$ with $\mathrm{var}(\varphi) \subseteq V$ and $\varphi * F \in \mathcal{USAT}$ we have that $k$-resolution derives $\bot$ from $\varphi * F$.
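Relative hardness can be computed by brute force for tiny clause-sets. The sketch below assumes the usual recursive definition of generalised unit-clause propagation ($r_0$ only detects the empty clause; $r_{k}$ sets a literal to true whenever $r_{k-1}$ refutes setting it to false), which is not restated in this section. The encoding (literals as non-zero integers) and all function names are hypothetical.

```python
from itertools import product

BOT = frozenset()  # the empty clause

def apply(phi, F):
    """phi * F for a partial assignment phi (dict var -> bool)."""
    return {frozenset(x for x in C if abs(x) not in phi)
            for C in F if not any(phi.get(abs(x)) == (x > 0) for x in C)}

def r(k, F):
    """Generalised unit-clause propagation r_k (r_1 is ordinary UCP)."""
    F = set(F)
    if BOT in F:
        return {BOT}
    if k == 0:
        return F
    while True:
        for x in {x for C in F for x in C}:
            # If setting x to false is refuted at level k-1, x is forced true.
            if r(k - 1, apply({abs(x): x < 0}, F)) == {BOT}:
                F = apply({abs(x): x > 0}, F)
                break
        else:
            return F  # no literal is forced
        if BOT in F:
            return {BOT}

def satisfiable(F):
    vs = sorted({abs(x) for C in F for x in C})
    return any(not apply(dict(zip(vs, bits)), F)
               for bits in product([False, True], repeat=len(vs)))

def relative_hardness(F, V):
    """Minimal k such that r_k refutes every unsatisfiable phi * F, var(phi) <= V."""
    V, k = sorted(V), 0
    while True:
        if all(satisfiable(G) or r(k, G) == {BOT}
               for vals in product([None, False, True], repeat=len(V))
               for G in [apply({v: b for v, b in zip(V, vals) if b is not None}, F)]):
            return k
        k += 1
```

With $V = \mathrm{var}(F)$ this yields the (absolute) hardness, so the function can also be used to test membership in $\mathcal{UC}_k$ for small examples.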

Obviously $\mathrm{hd}^V(F) \le \mathrm{hd}(F)$ and $\mathrm{hd}^{\mathrm{var}(F)}(F) = \mathrm{hd}(F)$, as well as $\mathrm{whd}^V(F) \le \mathrm{whd}(F)$ and $\mathrm{whd}^{\mathrm{var}(F)}(F) = \mathrm{whd}(F)$. Having a representation $F'$ of $F$ with $\mathrm{hd}^{\mathrm{var}(F)}(F') \le 1$ is closely related to what is typically called "maintaining arc consistency"; it would be precisely that if we would use p-hardness instead of hardness, while using (only) hardness is a certain weakening. Having $\mathrm{hd}^{\mathrm{var}(F)}(F') = 0$ here is equivalent to $\mathrm{prc}_0(F) \subseteq F'$, and thus for hardness 0 new variables are not helpful, neither for the relative nor the absolute condition.

Conjecture 6.2 is false for relative hardness, since regarding relative hardness the hierarchy collapses to the first level: we will present the details in a future paper, but they are not difficult; since there are no conditions on the new variables, the $r_k$-computations for $k > 1$ can be encoded into CNF, only relying on $r_1$. Such an encoding is an extension of Theorem 1 in [?], using similar techniques. More involved is the collapse of the $\mathcal{WC}_k$-hierarchy to the first level regarding relative hardness; we believe we can also show this, but we better formulate it explicitly as a conjecture:

Conjecture 6.4 For every $k \ge 1$ there is a polytime function $t(F,V)$, which takes a clause-set $F$ and a finite set $V$ of variables as arguments, such that in case of $\mathrm{whd}^V(F) \le k$ the output $t(F,V)$ is a representation of $F$ with $\mathrm{whd}^V(t(F,V)) \le 1$.

Note that for all $F \in \mathcal{CLS}$ and $V \subseteq \mathcal{VA}$ holds $\mathrm{whd}^V(F) \le 1 \Leftrightarrow \mathrm{hd}^V(F) \le 1$. The collapse of all considered hierarchies to their first level, when considering the relative condition, is for us a major argument in favour of the absolute condition: Within the class of representations of relative hardness at most 1 (when using new variables) there is a lot of structure, and many representations fulfil absolute conditions; some basic examples follow in the remainder of this section.

6.2 The canonical translation 

If for the $F \in \mathcal{CLS}$ to be represented we have an equivalent DNF $G \in \mathcal{CLS}$, then we can apply the Tseitin translation, using one new variable $v$ to express one DNF-clause, i.e., using $\mathrm{prc}_0(v \leftrightarrow \bigwedge_{x \in C} x)$ for $C \in G$. The details are as follows.

We assume that an injection $\mathrm{vct} : \{(F,C) \mid F \in \mathcal{CLS} \wedge C \in F\} \to \mathcal{VA}$ is given, yielding the variables of the canonical translation, such that $\mathrm{var}(F) \cap \{\mathrm{vct}(F,C)\}_{C \in F} = \emptyset$ holds for all $F \in \mathcal{CLS}$ (that is, these variables are new for $F$). We write $\mathrm{vct}_F^C := \mathrm{vct}(F,C)$.

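Definition 6.5 The map $\mathrm{ct} : \mathcal{CLS} \to \mathcal{CLS}$ is defined for $F \in \mathcal{CLS}$ as

$\mathrm{ct}(F) := \{\, \{\overline{\mathrm{vct}_F^C}, x\} : C \in F \wedge x \in C \,\} \cup \{\, \{\mathrm{vct}_F^C\} \cup \overline{C} : C \in F \,\} \cup \{\, \{\mathrm{vct}_F^C\}_{C \in F} \,\}$.

The first two types of clauses are the prime implicates of the boolean functions $\mathrm{vct}_F^C \leftrightarrow \bigwedge_{x \in C} x$, while the last type (a long, single clause) says that one of the (DNF-)clauses from $F$ must be true. To emphasise: the map $\mathrm{ct}$ is a map from clause-sets to clause-sets, where the (implicit) interpretation of the input and the output is different: the input $F \in \mathcal{CLS}$ is interpreted as DNF, while the output $\mathrm{ct}(F) \in \mathcal{CLS}$ is interpreted as CNF. Some basic properties of the canonical translation:

1. The sizes of the canonical translation for $F \in \mathcal{CLS}$ are given by

   (a) $n(\mathrm{ct}(F)) = n(F) + c(F)$

   (b) $c(\mathrm{ct}(F)) = 1 + c(F) + \ell(F)$ for $F \neq \{\bot\}$.

2. $\mathrm{ct}(\top) = \{\bot\}$ and $\mathrm{ct}(\{\bot\}) = \{\{\mathrm{vct}_{\{\bot\}}^{\bot}\}\}$.

3. Consider $\varphi \in \mathcal{PASS}$ with $\mathrm{var}(\varphi) \subseteq \mathrm{var}(F)$, and treat $F$ as a multi-clause-set, that is, if application of $\varphi$ to different non-satisfied clauses from $F$ makes these clauses equal, then no contractions are performed. Then the canonical translation behaves homomorphically regarding application of partial assignments in the sense that $\mathrm{ct}(\varphi * F)$ (recall that we need to treat $F$ here as a DNF) is isomorphic to $(\varphi \cup \psi) * \mathrm{ct}(F)$, where $\psi$ sets those $\mathrm{vct}_F^C$ to $0$ for which there is $x \in C$ with $\varphi(x) = 0$.

Example 6.6 We give some simple examples for canonical translations.

1. For $F := \{\{v_1\}, \bot\}$ we have

   $\mathrm{ct}(F) = \{\, \{\overline{\mathrm{vct}_F^{\{v_1\}}}, v_1\}, \{\mathrm{vct}_F^{\{v_1\}}, \overline{v_1}\}, \{\mathrm{vct}_F^{\bot}\}, \{\mathrm{vct}_F^{\{v_1\}}, \mathrm{vct}_F^{\bot}\} \,\}$.

2. For $F := \{\{v_1,v_2,v_3\}, \{v_1,v_2,v_4\}\}$, with clauses $C_1, C_2$ in this order, we have

   $\mathrm{ct}(F) = \{\, \{\overline{\mathrm{vct}_F^{C_1}}, v_1\}, \{\overline{\mathrm{vct}_F^{C_1}}, v_2\}, \{\overline{\mathrm{vct}_F^{C_1}}, v_3\}, \{\mathrm{vct}_F^{C_1}, \overline{v_1}, \overline{v_2}, \overline{v_3}\}$ (that is, $(v_1 \wedge v_2 \wedge v_3) \leftrightarrow \mathrm{vct}_F^{C_1}$),
   $\{\overline{\mathrm{vct}_F^{C_2}}, v_1\}, \{\overline{\mathrm{vct}_F^{C_2}}, v_2\}, \{\overline{\mathrm{vct}_F^{C_2}}, v_4\}, \{\mathrm{vct}_F^{C_2}, \overline{v_1}, \overline{v_2}, \overline{v_4}\}$ (that is, $(v_1 \wedge v_2 \wedge v_4) \leftrightarrow \mathrm{vct}_F^{C_2}$),
   $\{\mathrm{vct}_F^{C_1}, \mathrm{vct}_F^{C_2}\} \,\}$.

3. Applying $\varphi := \langle v_3 \to 1, v_4 \to 1 \rangle$ to the last example (Case 2) yields

   $\varphi * \mathrm{ct}(F) = \{\, \{\overline{\mathrm{vct}_F^{C_1}}, v_1\}, \{\overline{\mathrm{vct}_F^{C_1}}, v_2\}, \{\mathrm{vct}_F^{C_1}, \overline{v_1}, \overline{v_2}\}$ (that is, $(v_1 \wedge v_2) \leftrightarrow \mathrm{vct}_F^{C_1}$),
   $\{\overline{\mathrm{vct}_F^{C_2}}, v_1\}, \{\overline{\mathrm{vct}_F^{C_2}}, v_2\}, \{\mathrm{vct}_F^{C_2}, \overline{v_1}, \overline{v_2}\}$ (that is, $(v_1 \wedge v_2) \leftrightarrow \mathrm{vct}_F^{C_2}$),
   $\{\mathrm{vct}_F^{C_1}, \mathrm{vct}_F^{C_2}\} \,\}$.

4. Applying $\varphi := \langle v_3 \to 0 \rangle$ to Case 2 yields

   $\varphi * \mathrm{ct}(F) = \{\, \{\overline{\mathrm{vct}_F^{C_1}}, v_1\}, \{\overline{\mathrm{vct}_F^{C_1}}, v_2\}, \{\overline{\mathrm{vct}_F^{C_1}}\}$,
   $\{\overline{\mathrm{vct}_F^{C_2}}, v_1\}, \{\overline{\mathrm{vct}_F^{C_2}}, v_2\}, \{\overline{\mathrm{vct}_F^{C_2}}, v_4\}, \{\mathrm{vct}_F^{C_2}, \overline{v_1}, \overline{v_2}, \overline{v_4}\}$ (that is, $(v_1 \wedge v_2 \wedge v_4) \leftrightarrow \mathrm{vct}_F^{C_2}$),
   $\{\mathrm{vct}_F^{C_1}, \mathrm{vct}_F^{C_2}\} \,\}$.

5. While applying $\varphi := \langle v_3 \to 0 \rangle$ and $\psi := \langle \mathrm{vct}_F^{C_1} \to 0 \rangle$ to Case 2 yields

   $(\varphi \cup \psi) * \mathrm{ct}(F) = \{\, \{\overline{\mathrm{vct}_F^{C_2}}, v_1\}, \{\overline{\mathrm{vct}_F^{C_2}}, v_2\}, \{\overline{\mathrm{vct}_F^{C_2}}, v_4\}, \{\mathrm{vct}_F^{C_2}, \overline{v_1}, \overline{v_2}, \overline{v_4}\}$ (that is, $(v_1 \wedge v_2 \wedge v_4) \leftrightarrow \mathrm{vct}_F^{C_2}$),
   $\{\mathrm{vct}_F^{C_2}\} \,\}$.

In Case 3 we see an example of why for the canonical translation to have the homomorphism property we must consider $F$ as a multi-clause-set. That is, $\varphi * F = \{\{v_1,v_2\}\}$, and so $\varphi * \mathrm{ct}(F) \neq \mathrm{ct}(\varphi * F)$: the clause $\{v_1,v_2\}$ is represented by two separate new variables in $\varphi * \mathrm{ct}(F)$ compared to only one in $\mathrm{ct}(\varphi * F)$.

In Case 4 we see an example where for the homomorphism property of the canonical translation not just renaming, but also some unit-clause elimination is needed. These unit-clauses are added in Case 5, extending the assignment to falsify the new variable $\mathrm{vct}_F^{C_1}$ corresponding to the falsified DNF-clause $C_1$.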

Lemma 6.7 Consider $F \in \mathcal{CLS}$ (as CNF) and an equivalent DNF-clause-set $G \in \mathcal{CLS}$. Then $\mathrm{ct}(G)$ is a CNF-representation of $F$.

Proof: $\mathrm{ct}(G)$ is true iff at least one of its vct-variables is set to true, which is precisely the case iff at least one of the DNF-clauses of $G$ is satisfied, where the (DNF-)clauses of $G$ cover precisely the satisfying assignments of $F$. □
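The canonical translation and the statement of Lemma 6.7 can be illustrated on a small instance. In this sketch literals are non-zero integers, the DNF is given as a list of clauses, and the variable $\mathrm{vct}^C$ of the $i$-th clause is numbered `base + i`; this numbering, the function names and the example are assumptions of the illustration.

```python
from itertools import product

def ct(G):
    """Canonical translation of a DNF G (list of sets of non-zero ints)."""
    base = max((abs(x) for C in G for x in C), default=0)
    cnf, new_vars = [], []
    for i, C in enumerate(G, start=1):
        v = base + i                                   # stands for vct^C
        new_vars.append(v)
        cnf += [frozenset({-v, x}) for x in C]         # vct^C implies each x in C
        cnf.append(frozenset({v} | {-x for x in C}))   # conjunction of C implies vct^C
    cnf.append(frozenset(new_vars))                    # some DNF-clause is true
    return cnf

def projected_models(F, xs):
    """Satisfying assignments of the CNF F, projected to the variables xs."""
    vs = sorted({abs(x) for C in F for x in C} | set(xs))
    out = set()
    for bits in product([False, True], repeat=len(vs)):
        a = dict(zip(vs, bits))
        if all(any(a[abs(x)] == (x > 0) for x in C) for C in F):
            out.add(tuple(a[v] for v in xs))
    return out

# Hypothetical example: the DNF (x1 and x2) or (not x1 and x3) is equivalent
# to the CNF (not x1 or x2) and (x1 or x3).
G = [{1, 2}, {-1, 3}]
F = [{-1, 2}, {1, 3}]
```

Comparing `projected_models(ct(G), [1, 2, 3])` with `projected_models(F, [1, 2, 3])` tests the representation property, and the clause and variable counts of `ct(G)` can be compared with the size formulas stated for the canonical translation.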



Lemma 6.8 For $F \in \mathcal{CLS}$ we have $\mathrm{hd}^{\mathrm{var}(F)}(\mathrm{ct}(F)) \le 1$ (recall Definition 6.3).

Proof: Consider $\varphi \in \mathcal{PASS}$ with $\mathrm{var}(\varphi) \subseteq \mathrm{var}(F)$ and $\varphi * \mathrm{ct}(F) \in \mathcal{USAT}$. Then all DNF-clauses of $F$ are falsified, which yields via UCP that all vct-variables are set to false, and thus $r_1(\varphi * \mathrm{ct}(F)) = \{\bot\}$. □

In [7] a more general version of Lemma 6.8 is proven, showing that for all "smooth" DNNFs (Decomposable Negation Normal Form) the Tseitin translation yields a clause-set which maintains arc-consistency via UCP (a somewhat stronger property than relative hardness $\le 1$ as in Lemma 6.8); see the footnote below. That Lemma 6.8 only establishes the relative condition, and not the absolute one, is due to the fact that setting vct-variables to $0$ can pose arbitrarily hard conditions; a concrete example follows, while a more drastic general construction is given in Lemma 6.10. However the difficulties can be overcome, by just removing them: In Lemma 6.16 we will see that when dropping the part of the canonical translation which gives meaning to setting vct-variables to $0$, we actually can establish the absolute condition.



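Footnote: There is a mistake in [7] in that it claims that the Tseitin translation of all DNNFs maintains arc-consistency via UCP, however this is shown only for smooth DNNFs, as confirmed by George Katsirelos via e-mail in January 2012.

Example 6.9 Consider the following clause-set with variables $x_1, \dots, x_5$:

$F := \{\, \{x_1,x_2,x_3\}, \{x_1,x_2,x_4\}, \{x_1,x_2,x_5\} \,\}$,

with clauses $C_1, C_2, C_3$ in this order. The canonical translation is

$\mathrm{ct}(F) = \{ \{x_1, \overline{\mathrm{vct}_F^{C_1}}\}, \{x_2, \overline{\mathrm{vct}_F^{C_1}}\}, \{x_3, \overline{\mathrm{vct}_F^{C_1}}\}, \{\overline{x_1}, \overline{x_2}, \overline{x_3}, \mathrm{vct}_F^{C_1}\} \}$ (that is, $\mathrm{vct}_F^{C_1} \leftrightarrow (x_1 \wedge x_2 \wedge x_3)$)
$\cup\ \{ \{x_1, \overline{\mathrm{vct}_F^{C_2}}\}, \{x_2, \overline{\mathrm{vct}_F^{C_2}}\}, \{x_4, \overline{\mathrm{vct}_F^{C_2}}\}, \{\overline{x_1}, \overline{x_2}, \overline{x_4}, \mathrm{vct}_F^{C_2}\} \}$ (that is, $\mathrm{vct}_F^{C_2} \leftrightarrow (x_1 \wedge x_2 \wedge x_4)$)
$\cup\ \{ \{x_1, \overline{\mathrm{vct}_F^{C_3}}\}, \{x_2, \overline{\mathrm{vct}_F^{C_3}}\}, \{x_5, \overline{\mathrm{vct}_F^{C_3}}\}, \{\overline{x_1}, \overline{x_2}, \overline{x_5}, \mathrm{vct}_F^{C_3}\} \}$ (that is, $\mathrm{vct}_F^{C_3} \leftrightarrow (x_1 \wedge x_2 \wedge x_5)$)
$\cup\ \{ \{\mathrm{vct}_F^{C_1}, \mathrm{vct}_F^{C_2}, \mathrm{vct}_F^{C_3}\} \}$ (that is, $\mathrm{vct}_F^{C_1} \vee \mathrm{vct}_F^{C_2} \vee \mathrm{vct}_F^{C_3}$).

Applying the partial assignment $\varphi := \langle x_3 \to 1, x_4 \to 1, x_5 \to 1, \mathrm{vct}_F^{C_3} \to 0 \rangle$ yields

$F' := \varphi * \mathrm{ct}(F) = \{ \{x_1, \overline{\mathrm{vct}_F^{C_1}}\}, \{x_2, \overline{\mathrm{vct}_F^{C_1}}\}, \{\overline{x_1}, \overline{x_2}, \mathrm{vct}_F^{C_1}\} \}$ (that is, $\mathrm{vct}_F^{C_1} \leftrightarrow (x_1 \wedge x_2)$)
$\cup\ \{ \{x_1, \overline{\mathrm{vct}_F^{C_2}}\}, \{x_2, \overline{\mathrm{vct}_F^{C_2}}\}, \{\overline{x_1}, \overline{x_2}, \mathrm{vct}_F^{C_2}\} \}$ (that is, $\mathrm{vct}_F^{C_2} \leftrightarrow (x_1 \wedge x_2)$)
$\cup\ \{ \{\overline{x_1}, \overline{x_2}\} \}$ (that is, $\neg(x_1 \wedge x_2)$) $\cup\ \{ \{\mathrm{vct}_F^{C_1}, \mathrm{vct}_F^{C_2}\} \}$ (that is, $\mathrm{vct}_F^{C_1} \vee \mathrm{vct}_F^{C_2}$).

We have $F' \in \mathcal{USAT}$, where $F'$ has no unit-clauses, whence $\mathrm{hd}(F') \ge 2$, and so $\mathrm{ct}(F) \notin \mathcal{UC}_1$.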
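Example 6.9 can be replayed mechanically. The sketch below rebuilds $\mathrm{ct}(F)$ for $F = \{\{x_1,x_2,x_3\},\{x_1,x_2,x_4\},\{x_1,x_2,x_5\}\}$, applies the partial assignment, and checks that unit-clause propagation makes no progress although the result is unsatisfiable. The integer encoding ($x_i = i$, and the three vct-variables numbered 6, 7, 8) and the function names are assumptions of this illustration.

```python
from itertools import product

def ct(G):
    """Canonical translation; the i-th DNF-clause gets the new variable base + i."""
    base = max(abs(x) for C in G for x in C)
    cnf = []
    for i, C in enumerate(G, start=1):
        v = base + i
        cnf += [frozenset({-v, x}) for x in C]
        cnf.append(frozenset({v} | {-x for x in C}))
    cnf.append(frozenset(base + i for i in range(1, len(G) + 1)))
    return cnf

def apply(phi, F):
    """phi * F for a partial assignment phi (dict var -> bool)."""
    return {frozenset(x for x in C if abs(x) not in phi)
            for C in F if not any(phi.get(abs(x)) == (x > 0) for x in C)}

def ucp(F):
    """Unit-clause propagation r_1; returns {empty clause} on a conflict."""
    F = set(F)
    while frozenset() not in F:
        unit = next((C for C in F if len(C) == 1), None)
        if unit is None:
            return F
        (x,) = unit
        F = apply({abs(x): x > 0}, F)
    return {frozenset()}

def satisfiable(F):
    vs = sorted({abs(x) for C in F for x in C})
    return any(not apply(dict(zip(vs, bits)), F)
               for bits in product([False, True], repeat=len(vs)))

G = [{1, 2, 3}, {1, 2, 4}, {1, 2, 5}]
phi = {3: True, 4: True, 5: True, 8: False}   # 8 stands for the vct-variable of C_3
F_prime = apply(phi, ct(G))
```

Here `ucp(F_prime)` returns `F_prime` unchanged while `satisfiable(F_prime)` is false, which is exactly the witness that the translation is not in $\mathcal{UC}_1$.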

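Lemma 6.10 Consider $F \in \mathcal{CLS}$. Let $v \in \mathcal{VA} \setminus \mathrm{var}(F)$ and $F' := F \cup \{\{v\}\}$. Then $\mathrm{hd}(\mathrm{ct}(F')) \ge \mathrm{hd}(F)$.

Proof: Let $\varphi := \langle \mathrm{vct}_{F'}^{C} \to 0 : C \in F \rangle \cup \langle v, \mathrm{vct}_{F'}^{\{v\}} \to 1 \rangle$. Then $\varphi * \mathrm{ct}(F') = F'' := \{\overline{C} : C \in F\}$, where $\mathrm{hd}(\mathrm{ct}(F')) \ge \mathrm{hd}(F'') = \mathrm{hd}(F)$. □

If we do not have just a DNF, but a "disjoint" or "orthogonal" DNF (see Section 1.6 and Chapter 7 in [?]), which are as clause-sets precisely the hitting clause-sets, then we obtain absolute hardness 1: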

Lemma 6.11 For $F \in \mathcal{HIT}$ we have $\mathrm{ct}(F) \in \mathcal{UC}$, where $\mathrm{ct}(F)$ is a representation of the DNF-clause-set $F$.

Proof: Consider a partial assignment $\varphi$ such that $\varphi * \mathrm{ct}(F)$ is unsatisfiable. Since $\mathcal{HIT}$ is stable under application of partial assignments, and furthermore here no contractions take place, w.l.o.g. we can assume that $\mathrm{var}(\varphi) \cap \mathrm{var}(F) = \emptyset$. If $\varphi$ sets two or more vct-variables to true, then UCP yields a contradiction, since any two clauses from $F$ clash. If $\varphi$ would set precisely one vct-variable to true, then $\varphi * \mathrm{ct}(F)$ would be satisfiable. So assume that $\varphi$ sets no vct-variable to true. Now $\varphi$ must set all vct-variables to false, since, as already mentioned, just setting one vct-variable to true satisfies $\mathrm{ct}(F)$. And thus $\bot \in \varphi * \mathrm{ct}(F)$. □



We now want to show that via the canonical translation we can obtain representations of $D(F)$ for $F \in \mathcal{UHIT}$. For this we show first that all such $D(F)$ have short hitting DNF clause-sets. For $F \in \mathcal{CLS}$ let $\#\mathrm{sat}(F) \in \mathbb{N}_0$ denote the number of satisfying assignments for $F$, that is, $\#\mathrm{sat}(F) = |\mathrm{DNF}(F)|$.

Lemma 6.12 Consider $F \in \mathcal{UHIT}$, and let $m := n(F) + c(F)$.

1. $\#\mathrm{sat}(D(F)) = 2^{m-1}$.

2. Let $F' := \{\, \overline{C} \cup \{u_C\} \mid C \in F \,\}$; by definition we have $F' \in \mathcal{HIT}$. Furthermore $\#\mathrm{sat}(F') = 2^{m-1}$.

3. $F'$ as a DNF-clause-set is equivalent to the CNF-clause-set $D(F)$.

Proof: We have $\sum_{C \in F} 2^{-|C|} = 1$ (see [?]). Thus $\sum_{C \in D(F)} 2^{-|C|} = \frac{1}{2}$, which proves Part 1 (note $m = n(D(F))$ and $D(F) \in \mathcal{HIT}$). Part 2 follows from Part 1, since $F'$ results from $D(F)$ by flipping literals. Finally we consider Part 3. All elements of $F'$, as DNF-clauses (i.e., conjunctions of literals), represent satisfying assignments for $D(F)$, that is, for all $C \in F'$ and $D \in D(F)$ we have $C \cap D \neq \emptyset$. By Part 2 precisely half of the total assignments of the DNF-clause-set $F'$ are falsifying, and thus precisely half of the total assignments are satisfying: since this is the same number as the satisfying assignments of $D(F)$, we obtain that the DNF-clause-set $F'$ is equivalent to the CNF-clause-set $D(F)$. □

By Lemma 6.12 and Lemma 6.11 we obtain

Theorem 6.13 For $F \in \mathcal{UHIT}$ there is a short CNF-representation (using new variables) of $D(F)$ in $\mathcal{UC}$, namely $F' := \mathrm{ct}(\{\overline{C} \cup \{u_C\} : C \in F\}) \in \mathcal{UC}$, where:

1. $n(F') = n(F) + 2c(F)$.

2. $c(F') = 1 + 2c(F) + \ell(F)$.

This applies especially for $F \in \mathcal{SMU}_{\delta=1} \subset \mathcal{UHIT}$.

Finally we show that when relaxing the canonical translation, using only the necessary direction of the constitutive equivalences, then we actually obtain representations in $\mathcal{UC}$ for every DNF-clause-set:

Definition 6.14 The map $\mathrm{ct}^- : \mathcal{CLS} \to \mathcal{CLS}$ ("reduced canonical translation") is defined for $F \in \mathcal{CLS}$ as $\mathrm{ct}^-(F) := \{\, \{\overline{\mathrm{vct}_F^C}, x\} : C \in F \wedge x \in C \,\} \cup \{\, \{\mathrm{vct}_F^C\}_{C \in F} \,\}$.

Note that all clauses of $\mathrm{ct}^-(F)$ are binary except for the long clause stating that one of the vct-variables must become true. With the same proof as Lemma 6.7 we get:

Lemma 6.15 Consider $F \in \mathcal{CLS}$ (as CNF) and an equivalent DNF-clause-set $G \in \mathcal{CLS}$. Then $\mathrm{ct}^-(G)$ is a CNF-representation of $F$.



Lemma 6.16 For $F \in \mathcal{CLS}$ we have $\mathrm{ct}^-(F) \in \mathcal{UC}$ (i.e., $\mathrm{ct}^- : \mathcal{CLS} \to \mathcal{UC}$).

Proof: For the sake of contradiction consider a partial assignment $\varphi$ such that $F' := r_1(\varphi * \mathrm{ct}^-(F)) \in \mathcal{USAT}$ but $F' \neq \{\bot\}$. Note that $F'$ contains neither $\bot$ nor a unit-clause, and thus $F'$ is a subset of $\mathrm{ct}^-(F)$ except for the possibly shortened or satisfied long vct-clause. If $F'$ contains no new variables, then thus $F' = \top$, a contradiction. So there exists some $C \in F$ such that $\mathrm{vct}_F^C$ occurs in $F'$. Consider the assignment $\varphi'$, which sets $\mathrm{vct}_F^C$ and all $x \in C$ to true, while setting all other (remaining) new variables to false: $\varphi'$ satisfies $F'$, a contradiction. □
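Membership in $\mathcal{UC}$ can be tested exhaustively for tiny instances, which allows comparing the full and the reduced canonical translation. In the following sketch the integer encoding, the numbering of the vct-variables, and the names `translate` and `in_UC` are assumptions of the illustration; the test enumerates all partial assignments and is exponential.

```python
from itertools import product

def translate(G, reduced):
    """ct(G), or the reduced translation if reduced is True; the i-th
    DNF-clause gets the new variable base + i."""
    base = max(abs(x) for C in G for x in C)
    cnf = set()
    for i, C in enumerate(G, start=1):
        v = base + i
        cnf |= {frozenset({-v, x}) for x in C}
        if not reduced:
            cnf.add(frozenset({v} | {-x for x in C}))
    cnf.add(frozenset(base + i for i in range(1, len(G) + 1)))
    return cnf

def apply(phi, F):
    return {frozenset(x for x in C if abs(x) not in phi)
            for C in F if not any(phi.get(abs(x)) == (x > 0) for x in C)}

def ucp(F):
    F = set(F)
    while frozenset() not in F:
        unit = next((C for C in F if len(C) == 1), None)
        if unit is None:
            return F
        (x,) = unit
        F = apply({abs(x): x > 0}, F)
    return {frozenset()}

def satisfiable(F):
    vs = sorted({abs(x) for C in F for x in C})
    return any(not apply(dict(zip(vs, bits)), F)
               for bits in product([False, True], repeat=len(vs)))

def in_UC(F):
    """True iff UCP refutes phi * F for every partial assignment phi
    with phi * F unsatisfiable."""
    vs = sorted({abs(x) for C in F for x in C})
    for vals in product([None, False, True], repeat=len(vs)):
        G = apply({v: b for v, b in zip(vs, vals) if b is not None}, F)
        if ucp(G) != {frozenset()} and not satisfiable(G):
            return False
    return True
```

For the non-hitting DNF $\{\{x_1,x_2,x_3\},\{x_1,x_2,x_4\},\{x_1,x_2,x_5\}\}$ the full translation fails the test, in line with Example 6.9, while reduced translations pass it.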



Example 6.17 We conclude our basic considerations of "canonical translations" by discussing "unique extension properties". A representation $F'$ of $F$ has the unique extension property ("uep") if for every total satisfying assignment of $F$ there is exactly one extension to a satisfying assignment of $F'$. For every $F \in \mathcal{CLS}$ the representation $\mathrm{ct}(F)$ of $F$ has the uep, since a variable $\mathrm{vct}_F^C$ must be set to $1$ precisely for those $C \in F$ which are satisfied by $\varphi$ in the DNF-sense (i.e., $\varphi * \{C\} = \{\bot\}$). On the other hand, the representation $\mathrm{ct}^-(F)$ of $F$ in general does not have the uep: The total satisfying assignments for $\mathrm{ct}^-(F)$ extending $\varphi$ are exactly those which set at least one of the variables $\mathrm{vct}_F^C$ true for those $C \in F$ which are satisfied in the DNF-sense.

A representation $F'$ of $F$ has the strong unique extension property if for every partial assignment $\varphi$ with $\bot \in \varphi * F$ (i.e., $\varphi$ satisfies at least one of the DNF-clauses) there is exactly one extension on the new variables (alone) to a satisfying assignment of $F'$. For $F \in \mathcal{HIT}$ the representation $\mathrm{ct}(F)$ of $F$ has the strong uep, since the satisfying assignments given by the clauses of $F$ are inconsistent with each other.

6.3 XOR-clauses 

For the n-bit parity function a;i © • • • © .x„ = the uniqiic equivalent clause-set 
prcQ(a;i © • • • © x„ = 0) (unique since the prime implicates are not resolvable) has 
2"~^ clauses. We now show that a typical SAT translation of the n-bit parity 
function, using new variables yi (for i G {2,...,n — 1}) to denote the sum of the 
first i bits, is in UC. 

Lemma 6.18 Consider n > 3, literals a;i,...,x„ with different underlying vari- 
ables, and variables ?/2, • • • , Vn-i- Let F := F2U [[j^Z^ Pi) U P„, where 



1. P2 ■■= prco(a;i ® X2 = 2/2), 

2. Pi := prco(yi-i ® Xi = yi), 

3. Pn ■■= prco(y„-i © a;„ = 0). 

We have F e UC, and F represents prco(a;i ® • • • ® a;„ = 0). 

Proof: Assume for the sake of contradiction that $F \notin \mathcal{UC}$. Thus there exists
a partial assignment $\varphi$ such that for $F' := r_1(\varphi * F)$ we have $F' \in \mathcal{USAT}$, but
$F' \ne \{\bot\}$. By definition $F'$ has no clauses of size $\le 1$ and is non-empty. Observe
that setting any variable in $P_i$ for $i \in \{2, \dots, n-1\}$ yields a pair of binary clauses
representing an equivalence or anti-equivalence between the two remaining variables.
Also if $P_i \cap F' \ne \emptyset$ for some $i \in \{2, \dots, n-1\}$, then we have $P_i \subseteq F'$, since all
clauses of $P_i$ contain all variables of $P_i$. Therefore we have $F' = E \cup \bigcup_{i \in I} P_i$ for
some subset $I \subseteq \{2, \dots, n-1\}$, where $E$ is a set of clauses representing a chain of
equalities and inequalities. Consider the assignment $\varphi' := \langle x_i \to 1 : i \in I \rangle$. We
have $\varphi' * P_i = \varphi' * \mathrm{prc}_0(y_{i-1} \oplus x_i = y_i) = \mathrm{prc}_0(y_{i-1} \oplus 1 = y_i)$, an
anti-equivalence; note that $x_i$ is only in $P_i$, and so $\langle x_i \to 1 \rangle$ only touches $P_i$. So
$\varphi' * F'$ now contains only variable-disjoint chains of equivalences and
anti-equivalences, each trivially satisfiable, yielding a contradiction. □
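
For small $n$ both claims of Lemma 6.18 can be confirmed by exhaustive search. The sketch below is a brute-force check under illustrative assumptions, not the authors' tooling: literals are signed integers, $x_i$ is the integer `i`, $y_i$ is the integer `n + i - 1`, and all function names are hypothetical. It tests membership in UC directly from the definition (every unsatisfiable instantiation is refuted by unit-clause propagation), which is exponential and only meant for tiny instances.

```python
from itertools import product

def xor_clauses(lits):
    """Clauses of prc_0(l_1 xor ... xor l_m = 0) for literals given as signed ints."""
    out = set()
    for signs in product((1, -1), repeat=len(lits)):
        if signs.count(-1) % 2 == 1:  # odd number of complemented literals
            out.add(frozenset(s * l for s, l in zip(signs, lits)))
    return out

def chain_translation(n):
    """F = P_2 u ... u P_n from Lemma 6.18 (x_i -> i, y_i -> n + i - 1)."""
    assert n >= 3
    y = lambda i: n + i - 1
    F = xor_clauses([1, 2, y(2)])              # x_1 xor x_2 = y_2
    for i in range(3, n):
        F |= xor_clauses([y(i - 1), i, y(i)])  # y_{i-1} xor x_i = y_i
    F |= xor_clauses([y(n - 1), n])            # y_{n-1} xor x_n = 0
    return F

def apply(F, lit):
    """Set literal lit to true: drop satisfied clauses, remove falsified literals."""
    return {c - {-lit} for c in F if lit not in c}

def ucp_refutes(F):
    """True iff unit-clause propagation derives the empty clause."""
    F = set(F)
    while frozenset() not in F:
        unit = next((c for c in F if len(c) == 1), None)
        if unit is None:
            return False
        F = apply(F, next(iter(unit)))
    return True

def satisfiable(F):
    variables = sorted({abs(l) for c in F for l in c})
    for signs in product((1, -1), repeat=len(variables)):
        true_lits = {s * v for s, v in zip(signs, variables)}
        if all(c & true_lits for c in F):
            return True
    return False

def in_UC(F):
    """Definition check: every unsatisfiable phi * F is refuted by UCP."""
    variables = sorted({abs(l) for c in F for l in c})
    for choice in product((0, 1, -1), repeat=len(variables)):
        G = set(F)
        for v, s in zip(variables, choice):
            if s:
                G = apply(G, s * v)
        if not satisfiable(G) and not ucp_refutes(G):
            return False
    return True

def represents_parity(n):
    """The x-assignments extendable to a model of F are exactly the even-parity ones."""
    F = chain_translation(n)
    for bits in product((0, 1), repeat=n):
        G = set(F)
        for i, b in enumerate(bits):
            G = apply(G, i + 1 if b else -(i + 1))
        if satisfiable(G) != (sum(bits) % 2 == 0):
            return False
    return True
```

For contrast, the clause-set of all four binary clauses over two variables fails `in_UC`, since it is unsatisfiable but contains no unit clause.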






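Example 6.19 For $n = 3$ we get

$F = \{\, \{x_1, x_2, \overline{y_2}\}, \{x_1, \overline{x_2}, y_2\}, \{\overline{x_1}, x_2, y_2\}, \{\overline{x_1}, \overline{x_2}, \overline{y_2}\}, \{y_2, \overline{x_3}\}, \{\overline{y_2}, x_3\} \,\}$.

[Figure: the translation for $n = 3$, with inputs $x_1$, $x_2$, the auxiliary variable $y_2$, and the final constraint $y_2 \oplus x_3 = 0$.]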



A very interesting question is how far the (simple) Lemma 6.18 can be extended
towards representing arbitrary systems of linear equations. It seems to us that we
do not have polysize representations with bounded hardness in the UC-framework:



Conjecture 6.20 As usual, an "XOR-clause" is a (boolean) constraint of the form
$x_1 \oplus \dots \oplus x_n = 0$ for literals $x_1, \dots, x_n$, which we just represent by the clause
$\{x_1, \dots, x_n\} \in \mathcal{CL}$. An "XOR-clause-set" $F$ is a set of XOR-clauses, which is just
represented by an ordinary clause-set $F \in \mathcal{CLS}$ (with an alternative interpretation).
The conjecture now is that XOR-clause-sets do not have good representations with
bounded hardness, not even when using relative hardness. That is, there is no
$k \in \mathbb{N}_0$ and no polynomial $p(x)$ such that for all clause-sets $F \in \mathcal{CLS}$ there exists
a CNF-representation $F' \in \mathcal{CLS}$ (possibly using new variables), taking $F$ as an
XOR-clause-set, with $\ell(F') \le p(\ell(F))$ and $\mathrm{hd}^{\mathrm{var}(F)}(F') \le k$.

Basic results for showing such a lower bound are obtained in [?]. As we have already
remarked (after Definition 6.3), regarding relative hardness only $k \in \{0, 1\}$ are of
relevance (because we allow new variables), while regarding absolute hardness we
conjecture that also with new variables we have a proper hierarchy (Conjecture 6.2).

We conclude now our initial study on "good representations" by the basic ob- 
servations regarding the naive approach for translating XOR-clause-sets. 



7 Hardness under union 

When applied piecewise to a system of linear equations (with different auxiliary
variables for each single equation), the translation from Lemma 6.18 does not yield
a clause-set in $\mathcal{UC}$, as we show in Theorem 7.5. To facilitate the precise computation
of the hardness of the union of two such XOR-clause-translations, we present two
general tools for upper bounds on hardness and one for lower bounds.

Lemma 7.1 Consider $F \in \mathcal{CLS}$ and $V \subseteq \mathrm{var}(F)$. Let $P$ be the set of partial
assignments $\psi$ with $\mathrm{var}(\psi) = V$. Then $\mathrm{hd}(F) \le |V| + \max_{\psi \in P} \mathrm{hd}(\psi * F)$.

Proof: Consider a partial assignment $\varphi$ with $\varphi * F \in \mathcal{USAT}$; we have to show
$\mathrm{hd}(\varphi * F) \le |V| + \max_{\psi \in P} \mathrm{hd}(\psi * F)$. Build a resolution refutation of $\varphi * F$ by first
creating a splitting tree (possibly degenerated) on the variables of $V$; this splitting
tree (a perfect binary tree) has height $|V|$, and at each of its leaves we have a
clause-set $\varphi * (\psi * F)$ for some appropriate $\psi \in P$. Thus at each leaf we can attach a
splitting tree of Horton-Strahler number at most $\max_{\psi \in P} \mathrm{hd}(\psi * F)$, and
from that (via the well-known correspondence of splitting trees and resolution trees;
see [?] for details) we obtain a resolution tree fulfilling the desired hardness
bound. □



Corollary 7.2 For $F_1, F_2 \in \mathcal{CLS}$ holds $\mathrm{hd}(F_1 \cup F_2) \le \max(\mathrm{hd}(F_1), \mathrm{hd}(F_2)) + |\mathrm{var}(F_1) \cap \mathrm{var}(F_2)|$.



Proof: Apply Lemma 7.1 with $F := F_1 \cup F_2$ and $V := \mathrm{var}(F_1) \cap \mathrm{var}(F_2)$, and apply
the general upper bound $\mathrm{hd}(F_1 \cup F_2) \le \max(\mathrm{hd}(F_1), \mathrm{hd}(F_2))$ for variable-disjoint
$F_1, F_2$ (Lemma 15 in [?]). □

Lemma 7.3 Consider a clause-set $F \in \mathcal{USAT}$ and (arbitrary) literals $x, y$. Denote
by $F_{x \leftarrow y} \in \mathcal{USAT}$ the result of replacing $x$ by $y$ and $\overline{x}$ by $\overline{y}$ in $F$, followed by
removing clauses containing complementary literals. Then we have
$\mathrm{hd}(F_{x \leftarrow y}) \le \mathrm{hd}(F)$ and $\mathrm{whd}(F_{x \leftarrow y}) \le \mathrm{whd}(F)$.

Proof: Consider $T : F \vdash \bot$. It is a well-known fact (and a simple exercise) that
the substitution of $y$ into $x$ can be performed in $T$, obtaining $T_{x \leftarrow y} : F_{x \leftarrow y} \vdash \bot$.
This is easiest to see by performing first the substitution with $T$ itself, obtaining
a tree $T'$ which as a binary tree is identical to $T$, using "pseudo-clauses" with
(possibly) complementary literals; the resolution rule for sets $C, D$ of literals with
$x \in C$ and $\overline{x} \in D$ allows to derive the clause $(C \setminus \{x\}) \cup (D \setminus \{\overline{x}\})$, where the
resolution-variables are taken over from $T$. Now "tautological" clauses (containing
complementary literals) can be cut off from $T'$: from the root (labelled with $\bot$) go
to a first resolution step where the resolvent is non-tautological, while one of the
parent clauses is tautological (note that not both parent clauses can be tautological);
the subtree with the tautological clause can now be cut off, obtaining a new
pseudo-resolution tree where clauses only got (possibly) shorter (see Lemma 6.1,
part 1, in [?]). Repeating this process we obtain $T_{x \leftarrow y}$ as required. Obviously
$\mathrm{hs}(T_{x \leftarrow y}) \le \mathrm{hs}(T)$, and if in $T$ for every resolution step at least one of the parent
clauses has length at most $k$ for some fixed (otherwise arbitrary) $k \in \mathbb{N}_0$, then this
also holds for $T_{x \leftarrow y}$. □



Example 7.4 The simplest example showing that for satisfiable clause-sets $F$
(w-)hardness can be increased by substitution is given by $F := \{\{x\}, \{\overline{y}\}\}$ for
$\mathrm{var}(x) \ne \mathrm{var}(y)$. Here $\mathrm{hd}(F) = 0$, while $F_{x \leftarrow y} = \{\{y\}, \{\overline{y}\}\}$, and thus $\mathrm{hd}(F_{x \leftarrow y}) = 1$.
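
Lemma 7.3 and Example 7.4 can be replayed mechanically on tiny inputs. The Python sketch below assumes the definition of hardness via the generalised unit-clause propagation $r_k$ used earlier in the paper (for unsatisfiable $F$, the least $k$ with $r_k(F) = \{\bot\}$; in general, the maximum over all unsatisfiable instantiations). Signed integers as literals and the names `cls`, `r`, `hardness`, `substitute` are illustrative choices, and the search is exponential by design.

```python
from itertools import product

BOTTOM = frozenset({frozenset()})  # the clause-set containing only the empty clause

def cls(*clauses):
    """Build a clause-set from lists of signed integers."""
    return frozenset(frozenset(c) for c in clauses)

def apply(F, lit):
    """Set literal lit to true."""
    return frozenset(c - {-lit} for c in F if lit not in c)

def r(k, F):
    """Generalised unit-clause propagation r_k (r_1 is ordinary UCP)."""
    if frozenset() in F:
        return BOTTOM
    if k == 0:
        return F
    while True:
        for lit in sorted({l for c in F for l in c}):
            # lit is forced if setting it to false is refuted at level k-1
            if r(k - 1, apply(F, -lit)) == BOTTOM:
                F = apply(F, lit)
                break
        else:
            return F  # no forced literal left
        if frozenset() in F:
            return BOTTOM

def satisfiable(F):
    variables = sorted({abs(l) for c in F for l in c})
    for signs in product((1, -1), repeat=len(variables)):
        true_lits = {s * v for s, v in zip(signs, variables)}
        if all(c & true_lits for c in F):
            return True
    return False

def hardness(F):
    """Max over unsatisfiable phi * F of the least k with r_k(phi * F) = BOTTOM."""
    variables = sorted({abs(l) for c in F for l in c})
    hd = 0
    for choice in product((0, 1, -1), repeat=len(variables)):
        G = F
        for v, s in zip(variables, choice):
            if s:
                G = apply(G, s * v)
        if not satisfiable(G):
            k = 0
            while r(k, G) != BOTTOM:
                k += 1
            hd = max(hd, k)
    return hd

def substitute(F, x, y):
    """F_{x<-y}: replace x by y and -x by -y, then drop tautological clauses."""
    out = set()
    for c in F:
        d = frozenset(y if l == x else -y if l == -x else l for l in c)
        if not any(-l in d for l in d):
            out.add(d)
    return frozenset(out)

F = cls([1], [-2])          # Example 7.4 with x = 1, y = 2
G = substitute(F, 1, 2)     # the clause-set {{2}, {-2}}
print(hardness(F), hardness(G))
```

On the unsatisfiable clause-set of all four binary clauses over two variables, the same substitution lowers the hardness from 2 to 1, in line with Lemma 7.3.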

Theorem 7.5 For $n \ge 3$ consider the system

$x_1 \oplus x_2 \oplus \dots \oplus x_n = 0$
$x_1 \oplus x_2 \oplus \dots \oplus \overline{x_n} = 0$.

Let $F := F_1 \cup F_2$, where $F_1$ is the translation of the first equation by Lemma 6.18,
and $F_2$ is the translation of the second equation, using different auxiliary variables
(so $n(F) = 2 \cdot (n + (n - 2)) - n = 3n - 4$). We have $F \in \mathcal{USAT}$ with $\mathrm{hd}(F) = n$.



Proof: From Corollary 7.2 and Lemma 6.18 we obtain $\mathrm{hd}(F) \le n + 1$. Better
is to apply Lemma 7.1 with $V := \mathrm{var}(\{x_2, \dots, x_{n-1}\})$. By definition we see that
$\psi * F \in 2\text{-}\mathcal{CLS}$ (i.e., all clauses have length at most two) for $\psi$ with $\mathrm{var}(\psi) = V$.
By Lemma 19 in [?] we have $\mathrm{hd}(\psi * F) \le 2$, and thus $\mathrm{hd}(F) \le (n - 2) + 2 = n$.
The lower bound is obtained by an application of Lemma 2.8. Consider any literal
$x \in \mathrm{lit}(F_n)$, where the subscript in $F_n = F$ makes explicit the dependency on $n$.
Setting $x$ to true or false results either in an equivalence or in an anti-equivalence.
Propagating this (anti-)equivalence yields a clause-set $F'$ isomorphic to $F_{n-1}$, where
by Lemma 7.3 this propagation does not increase hardness, so we have
$\mathrm{hd}(\langle x \to 1 \rangle * F_n) \ge \mathrm{hd}(F') = \mathrm{hd}(F_{n-1})$. The argumentation can be trivially extended for
$n \in \{0, 1, 2\}$, and so by Lemma 2.8 we get $\mathrm{hd}(F) \ge n$. □
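
The value $\mathrm{hd}(F) = n$ can be reproduced for small $n$ by direct computation. The sketch below is an illustration under stated assumptions: it uses the characterisation of hardness for unsatisfiable clause-sets as the least $k$ with $r_k(F) = \{\bot\}$, encodes literals as signed integers, and picks a hypothetical numbering of the auxiliary variables ($n+1, \dots, 2n-2$ for the first equation, $2n-1, \dots, 3n-4$ for the second). None of the function names come from the paper.

```python
from itertools import product

BOTTOM = frozenset({frozenset()})  # the clause-set containing only the empty clause

def xor_clauses(lits):
    """Clauses of prc_0(l_1 xor ... xor l_m = 0) for literals given as signed ints."""
    out = set()
    for signs in product((1, -1), repeat=len(lits)):
        if signs.count(-1) % 2 == 1:
            out.add(frozenset(s * l for s, l in zip(signs, lits)))
    return out

def chain(lits, aux):
    """Lemma 6.18 translation of lits[0] xor ... xor lits[-1] = 0 with n-2 aux variables."""
    n = len(lits)
    F = xor_clauses([lits[0], lits[1], aux[0]])
    for i in range(2, n - 1):
        F |= xor_clauses([aux[i - 2], lits[i], aux[i - 1]])
    F |= xor_clauses([aux[n - 3], lits[n - 1]])
    return F

def theorem_7_5(n):
    """F = F_1 u F_2 for the two equations of Theorem 7.5."""
    xs = list(range(1, n + 1))
    aux1 = list(range(n + 1, 2 * n - 1))      # auxiliary variables of F_1
    aux2 = list(range(2 * n - 1, 3 * n - 3))  # auxiliary variables of F_2
    F1 = chain(xs, aux1)
    F2 = chain(xs[:-1] + [-xs[-1]], aux2)     # last literal complemented
    return frozenset(F1 | F2)

def apply(F, lit):
    """Set literal lit to true."""
    return frozenset(c - {-lit} for c in F if lit not in c)

def r(k, F):
    """Generalised unit-clause propagation r_k (r_1 is ordinary UCP)."""
    if frozenset() in F:
        return BOTTOM
    if k == 0:
        return F
    while True:
        for lit in sorted({l for c in F for l in c}):
            if r(k - 1, apply(F, -lit)) == BOTTOM:
                F = apply(F, lit)
                break
        else:
            return F
        if frozenset() in F:
            return BOTTOM

def hardness_unsat(F):
    """Least k with r_k(F) = BOTTOM; only meaningful for unsatisfiable F."""
    k = 0
    while r(k, F) != BOTTOM:
        k += 1
    return k

if __name__ == "__main__":
    for n in (3, 4):
        F = theorem_7_5(n)
        print(n, len({abs(l) for c in F for l in c}), hardness_unsat(F))
```

The printed variable count should agree with $3n - 4$, and the computed hardness with the value stated in the theorem.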

If $F_1, F_2$ in Theorem 7.5 were the direct translations (with $\mathrm{hd}(F_1) = \mathrm{hd}(F_2) = 0$),
then $\mathrm{hd}(F) = n$ would follow easily with Lemma 3.18 in [?], since then $F$ would
be simply the clause-set with all $2^n$ clauses of length $n$. Of course, regarding a good
translation of the system from Theorem 7.5 we can just use $\{\bot\}$, easily computed by
preprocessing; however the content of Conjecture 6.20 is that no preprocessing is
powerful enough to handle arbitrary (satisfiable!) systems of linear equations (over
the two-element field).

8 Conclusion and open problems



We have discussed three hierarchies $\mathcal{PC}_k$, $\mathcal{UC}_k$ and $\mathcal{WC}_k$ of target classes for "good"
SAT representations. We showed that each level $\mathcal{UC}_{k+1}$ contains clause-sets
without equivalent short clause-sets in $\mathcal{WC}_k$. And we showed conditions under which
the Tseitin translation produces translations in $\mathcal{UC}$. We conclude by directions for
future research.

8.1 Strictness of hierarchies 

A fundamental question is the strictness of the hierarchies $\mathcal{PC}_k$, $\mathcal{UC}_k$ and $\mathcal{WC}_k$ in
each of the dimensions. In Theorem 5.13 we have shown w.r.t. logical equivalence
(i.e., without new variables) that the $\mathcal{UC}_k$ and $\mathcal{WC}_k$ hierarchies are strict. It
follows that for $\mathcal{PC}_k$ at least every second level yields an advance regarding logical
equivalence (and polysize). This offers evidence that these hierarchies are useful; for
example, using failed literal reduction can allow one to use exponentially smaller
SAT translations. Open are the questions of strictness for the hierarchies allowing
new variables. To summarise, the main conjectures are:



1. Conjecture 5.14 strengthens Theorem 5.13 by taking the PC-hierarchy into
account.

2. Conjecture 6.2 roughly says that all of $\mathcal{PC}_k$, $\mathcal{UC}_k$ and $\mathcal{WC}_k$ are strict (similar
to Theorem 5.13), when allowing new variables under the absolute condition.

3. Conjecture 6.4 says that the $\mathcal{WC}_k$ hierarchy collapses to $\mathcal{WC}_1$ (and thus to
$\mathcal{PC}_1$), when allowing new variables under the relative condition.

8.2 Separating the hierarchies

For stating our three main conjectures relating the three hierarchies, we use the 
following notions: 

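• A sequence $(F'_n)_{n \in \mathbb{N}}$ is called a CNF-representation of $(F_n)_{n \in \mathbb{N}}$ if for all $n \in \mathbb{N}$
the clause-set $F'_n$ is a CNF-representation of $F_n$.

• A polysize sequence in $\mathcal{C} \subseteq \mathcal{CLS}$ is a sequence $(F_n)_{n \in \mathbb{N}}$ with $F_n \in \mathcal{C}$ for
all $n \in \mathbb{N}$, such that $(\ell(F_n))_{n \in \mathbb{N}}$ is polynomially bounded (i.e., there is a
polynomial $p(x)$ with $\ell(F_n) \le p(n)$ for all $n \in \mathbb{N}$).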

We conjecture that $\mathcal{WC}_2$ even without new variables offers possibilities for good
representations not offered by any $\mathcal{UC}_k$.

Conjecture 8.1 There exists a polysize $(F_n)_{n \in \mathbb{N}}$ in $\mathcal{WC}_2$, such that for no $k \in \mathbb{N}_0$
there exists a polysize CNF-representation $(F'_n)_{n \in \mathbb{N}}$ of $(F_n)_{n \in \mathbb{N}}$ in $\mathcal{UC}_k$.



A proof of Conjecture 8.1 would need, besides the handling of the new variables,
to develop lower-bound methods specifically for hardness, since the method via
trigger hypergraphs already yields lower bounds for w-hardness.

We conjecture that new variables cannot simulate higher hardness, strengthening
Theorem 5.13, Conjecture 5.14 and Conjecture 6.2:



Conjecture 8.2 For every $k \in \mathbb{N}_0$ there exists a polysize $(F_n)_{n \in \mathbb{N}}$ in $\mathcal{PC}_{k+1}$, such
that there is no polysize CNF-representation $(F'_n)_{n \in \mathbb{N}}$ of $(F_n)_{n \in \mathbb{N}}$ in $\mathcal{WC}_k$.

Finally we conjecture that there is a sequence of boolean functions which has 
polysize arc-consistent representations, but no polysize representations of bounded 
hardness, even for the w-hardness: 

Conjecture 8.3 There exists a polysize $(F_n)_{n \in \mathbb{N}}$ in $\mathcal{CLS}$, such that there is a
polysize CNF-representation $(F'_n)_{n \in \mathbb{N}}$ of $(F_n)_{n \in \mathbb{N}}$ with $\mathrm{hd}^{\mathrm{var}(F_n)}(F'_n) \le 1$ for all $n \in \mathbb{N}$,
while for no $k \in \mathbb{N}_0$ there is a polysize CNF-representation $(F''_n)_{n \in \mathbb{N}}$ of $(F_n)_{n \in \mathbb{N}}$ in
$\mathcal{WC}_k$.

Regarding our notion of a "polysize sequence" $(F_n)_{n \in \mathbb{N}}$, this is a very liberal notion,
allowing to express arbitrary boolean functions, since the number of variables could
be logarithmic in the index, and thus $F_n$ could contain exponentially many clauses
in the number of variables. The sequence of Theorem 5.13 also fulfils $n(F_n) = \Omega(n)$,
and making this provision one could speak of "simple" boolean functions; however,
this would complicate the formulations of our conjectures, and so we abstained from
it.

We conclude our considerations on hierarchies by considering the three hierarchies
SLUR($k$) introduced in [?], SLUR*($k$) introduced in [?] and CANON($k$)
introduced in [?], which we have compared to the UC-hierarchy in [?]. From
the point of view of polysize representations without new variables, the hierarchy
CANON($k$) collapses to CANON(0) $= \mathcal{UC}_0$:

Lemma 8.4 For $F \in \mathcal{CLS}$ let $k(F)$ be the minimal $k \in \mathbb{N}_0$ such that $F \in$ CANON($k$).
Then the function $\mathrm{prc}_0 : \mathcal{CLS} \to$ CANON(0) $= \mathcal{UC}_0$ can be computed in time
0{c{F)^''^ ■ i{F)), when the input is F together with k :— k{F). 

Proof: Let $K := 2^k$. So for every $C \in \mathrm{prc}_0(F)$ there exists $F' \subseteq F$ with $F' \models C$
and $c(F') \le K$, since a resolution tree of height $k$ has at most $K$ leaves. Now we
compute $\mathrm{prc}_0(F)$ as follows:

1. Set $P := \emptyset$.

2. Run through all $F' \subseteq F$ with $c(F') \le K$; their number is $O(c(F)^K)$.

3. For each $F'$ determine whether $F' \models \mathrm{puc}(F')$ holds, in which case clause
$\mathrm{puc}(F')$ is added to $P$; note that the test can be performed in time $O(2^K \cdot K)$.

4. The final $P$ obtained has $O(c(F)^K)$ many elements. After performing
subsumption elimination (in cubic time) we obtain $\mathrm{prc}_0(P)$ (by Lemma 3.7). □
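
The final step, subsumption elimination on a set of implicates that contains all prime implicates, can be illustrated on toy inputs. The Python sketch below does not implement steps 2 and 3 of the proof: as a plainly different, swapped-in technique it collects every non-tautological implicate over $\mathrm{var}(F)$ by brute force, and only the last step mirrors the proof. Signed-integer literals and the function names are illustrative assumptions, and the running time is exponential.

```python
from itertools import product

def implies(F, C, variables):
    """F |= C by brute force: every model of F (over variables) satisfies C."""
    for signs in product((1, -1), repeat=len(variables)):
        true_lits = {s * v for s, v in zip(signs, variables)}
        if all(c & true_lits for c in F) and not (C & true_lits):
            return False
    return True

def subsumption_elimination(P):
    """Keep only the clauses of P that have no proper subset in P."""
    return {C for C in P if not any(D < C for D in P)}

def prime_implicates(F):
    """All prime implicates of F: enumerate implicates, then eliminate subsumed ones."""
    variables = sorted({abs(l) for c in F for l in c})
    P = set()
    # each variable is absent (0), positive (1) or negative (-1) in the candidate clause
    for choice in product((0, 1, -1), repeat=len(variables)):
        C = frozenset(s * v for s, v in zip(choice, variables) if s)
        if implies(F, C, variables):
            P.add(C)
    return subsumption_elimination(P)

F = {frozenset({1, 2}), frozenset({-1, 2})}
print(prime_implicates(F))
```

For an unsatisfiable input the empty clause is an implicate, so the result is the clause-set containing only the empty clause.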



It seems an interesting question whether the two other hierarchies SLUR($k$), SLUR*($k$)
collapse or not, and whether they can be reduced to some $\mathcal{UC}_k$.

8.3 Hard boolean functions handled by oracles 

Finally we turn to concrete (sequences of) boolean functions which are currently
out of reach of good representations, and where the use of oracles thus is necessary.



Conjecture 6.20 says that systems of XOR-clauses (affine equations) have no
good representation, even when just considering arc-consistency. So the conjecture
is that here we have another example for the limitations of arc-consistent
representations as shown in [?]. To overcome these (conjectured) limitations, the theory
started here has to be generalised via the use of oracles as developed in [?], and
further discussed in Subsection 9.4 of [?]. The point of these oracles, which are
just sets $\mathcal{U} \subseteq \mathcal{USAT}$ of unsatisfiable clause-sets stable under application of partial
assignments, is to discover hard unsatisfiable (sub-)instances (typically in polynomial
time). Thus they are conceptually simpler than the current integration of SAT
solvers and methods from linear algebra (see [55, 56, 33, 49]).

An important aspect of the theory to be developed must be the usefulness of
the representation (with oracles) in context, that is, as a "constraint" in a bigger
problem: a boolean function $f$ represented by a clause-set $F$ is typically contained
in $F' \supseteq F$, where $F'$ is the SAT problem to be solved (containing also other
constraints). One approach is to require from the oracle also stability under addition of
clauses, as we have it already for the resolution-based reductions like $r_k$, so that the
(relativised) reductions $r_k^{\mathcal{U}}$ can always run on the whole clause-set (an instantiation
of $F'$). However for example for the oracle mentioned below, based on semidefinite
programming, this would be prohibitively expensive. And for some oracles, like
detection of minimally unsatisfiable clause-sets of a given deficiency, the problems
would turn from polytime to NP-hard in this way ([?], [?]). Furthermore, that we
have some representation, using for example some XOR-oracle, does not mean that
in other parts of the problem that oracle will also be of help. So in many cases it
is better to restrict the application of the oracle $\mathcal{U}$ to the subset $F \subseteq F'$, where
the oracle is required to achieve the desired hardness.

Another example of a current barrier is given by the satisfiable pigeonhole clause-sets
$\mathrm{PHP}^m_m$, which have variables $p_{i,j}$ for $i, j \in \{1, \dots, m\}$, and where the satisfying
assignments correspond precisely to the permutations of $\{1, \dots, m\}$. The question
is about "good" representations. In [9] we show $\mathrm{hd}(\mathrm{PHP}^m_m) = \mathrm{whd}(\mathrm{PHP}^m_m) = m - 1$,
and so the (standard representation) $\mathrm{PHP}^m_m \in \mathcal{CLS}$ itself is not a good
representation (it is small, but has high w-hardness). Actually, as explained in [?],
from [?] it follows that $\mathrm{PHP}^m_m$ has no polysize arc-consistent representation at all!
So again, here oracles are needed; see Subsection 9.4 of [31, 32] for a proposal of an
interesting oracle (with potentially good stability properties).



References 

[1] Carlos Ansótegui, Maria Luisa Bonet, Jordi Levy, and Felip Manyà. Measuring
the hardness of SAT instances. In Dieter Fox and Carla Gomes, editors,
Proceedings of the 23rd AAAI Conference on Artificial Intelligence (AAAI'08),
pages 222-228, 2008.

[2] Albert Atserias, Johannes Klaus Fichte, and Marc Thurley. Clause-learning
algorithms with many restarts and bounded-width resolution. In Oliver Kullmann,
editor, Theory and Applications of Satisfiability Testing, volume 5584
of Lecture Notes in Computer Science, pages 114-127. Springer, June 2009.

[3] Olivier Bailleux and Yacine Boufkhad. Efficient CNF encoding of boolean
cardinality constraints. In Francesca Rossi, editor, Principles and Practice of
Constraint Programming - CP 2003, volume 2833 of Lecture Notes in Computer
Science, pages 108-122. Springer, 2003.

[4] Olivier Bailleux, Yacine Boufkhad, and Olivier Roussel. New encodings of
pseudo-boolean constraints into CNF. In Kullmann [46], pages 181-194. ISBN
978-3-642-02776-5.

[5] Olivier Bailleux and Yacine Boufkhad. Efficient CNF encoding of boolean
cardinality constraints. In Principles and Practice of Constraint Programming
- CP 2003, volume 2833 of Lecture Notes in Computer Science, pages 108-122,
2003.

[6] Tomáš Balyo, Štefan Gurský, Petr Kučera, and Václav Vlček. On hierarchies
over the SLUR class. In Twelfth International Symposium on Artificial
Intelligence and Mathematics (ISAIM 2012), January 2012. Available at
http://www.cs.uic.edu/bin/view/Isaim2012/AcceptedPapers.

[7] Pedro Barahona, Jean Christoph Jung, George Katsirelos, and Toby Walsh.
Two encodings of DNNF theories, July 2008. Presented at ECAI'08 Workshop
on Inference methods based on Graphical Structures of Knowledge. Proceedings
at http://www.irit.fr/LC/.

[8] Christian Bessiere, George Katsirelos, Nina Narodytska, and Toby Walsh.
Circuit complexity and decompositions of global constraints. In Proceedings of the
Twenty-First International Joint Conference on Artificial Intelligence (IJCAI-09),
pages 412-418, 2009.

[9] Olaf Beyersdorff, Matthew Gwynne, and Oliver Kullmann. Hardness measures 
and resolution lower bounds, with applications to Pigeonhole principles. In 
preparation, to appear on arXiv, April 2013. 

[10] Armin Biere, Marijn J.H. Heule, Hans van Maaren, and Toby Walsh, editors.
Handbook of Satisfiability, volume 185 of Frontiers in Artificial Intelligence and
Applications. IOS Press, February 2009.

[11] Lucas Bordeaux and Joao Marques-Silva. Knowledge compilation with em- 
powerment. In Maria Bielikova, Gerhard Friedrich, Georg Gottlob, Stefan 
Katzenbeisser, and Gyorgy Turan, editors, SOFSEM 2012: Theory and Prac- 
tice of Computer Science, volume 7147 of Lecture Notes in Computer Science, 
pages 612-624. Springer, 2012. 

[12] Hans Kleine Büning and Xishun Zhao. The complexity of read-once resolution.
Annals of Mathematics and Artificial Intelligence, 36(4):419-435, December
2002.

[13] Michael Buro and Hans Kleine Büning. On resolution with short clauses.
Annals of Mathematics and Artificial Intelligence, 18(2-4):243-260, 1996.

[14] Marco Cadoli and Francesco M. Donini. A survey of knowledge compilation.
Journal of AI Communications, 10(3,4):137-150, December 1997.

[15] Ondřej Čepek, Petr Kučera, and Václav Vlček. Properties of SLUR formulae.
In Maria Bielikova, Gerhard Friedrich, Georg Gottlob, Stefan Katzenbeisser,
and Gyorgy Turan, editors, SOFSEM 2012: Theory and Practice of Computer
Science, volume 7147 of Lecture Notes in Computer Science, pages 177-189.
Springer, 2012.

[16] Jingchao Chen. Building a hybrid SAT solver via conflict-driven, look-ahead
and XOR reasoning techniques. In Kullmann [46], pages 298-311. ISBN
978-3-642-02776-5.

[17] Stephen A. Cook. An exponential example for analytic tableaux. Manuscript 
(see page 432), 1973. 

[18] Yves Crama and Peter L. Hammer. Boolean Functions: Theory, Algorithms, 
and Applications, volume 142 of Encyclopedia of Mathematics and Its Applica- 
tions. Cambridge University Press, 2011. ISBN 978-0-521-84751-3. 

[19] Nadia Creignou, Phokion Kolaitis, and Heribert Vollmer, editors. Complexity 
of Constraints: An Overview of Current Research Themes, volume 5250 of 
Lecture Notes in Computer Science (LNCS). Springer, 2008. ISBN-10 3-540- 
92799-9. 

[20] Evgeny Dantsin and Edward A. Hirsch. Worst-case upper bounds. In Biere
et al. [10], chapter 12, pages 403-424.

[21] Adnan Darwiche and Pierre Marquis. A knowledge compilation map. Journal 
of Artificial Intelligence Research, 17:229-264, 2002. 






[22] Adnan Darwiche and Knot Pipatsrisawat. On the power of clause-learning SAT 
solvers as resolution engines. Artificial Intelligence, 175(2):512-525, 2011. 

[23] Alvaro del Val. Tractable databases: How to make propositional unit resolution 
complete through compilation. In Proceedings of the 4th International Confer- 
ence on Principles of Knowledge Representation and Reasoning (KR'94), pages 
551-561, 1994. 

[24] Niklas Eén and Niklas Sörensson. Translating pseudo-boolean constraints into
SAT. Journal on Satisfiability, Boolean Modeling and Computation, 2:1-26,
March 2006.

[25] Hélène Fargier and Pierre Marquis. Extending the knowledge compilation map:
Krom, Horn, Affine and beyond. In Malik Ghallab, Constantine D. Spyropoulos,
Nikos Fakotakis, and Nikos Avouris, editors, ECAI 2008, volume 178 of
Frontiers in Artificial Intelligence and Applications, pages 50-54. IOS Press,
2008.

[26] Herbert Fleischner, Oliver Kullmann, and Stefan Szeider. Polynomial-time 
recognition of minimal unsatisfiable formulas with fixed clause-variable differ- 
ence. Theoretical Computer Science, 289(1):503-516, November 2002. 

[27] John Franco and Allen Van Gelder. A perspective on certain polynomial-time 
solvable classes of satisfiability. Discrete Applied Mathematics, 125:177-214, 
2003. 

[28] Ian P. Gent. Arc consistency in SAT. In Frank van Harmelen, editor,
Proceedings of the 15th European Conference on Artificial Intelligence (ECAI 2002),
pages 121-125. IOS Press, 2002.

[29] Matthew Gwynne and Oliver Kullmann. Towards a better understanding of
SAT translations. In Ulrich Berger and Denis Therien, editors, Logic and
Computational Complexity (LCC'11), as part of LICS 2011, June 2011. 10
pages, available at http://www.cs.swansea.ac.uk/lcc2011/.



[30] Matthew Gwynne and Oliver Kullmann. Generalising and unifying SLUR and 
unit-refutation completeness. In Peter van Emde Boas, Frans C. A. Groen, 
Giuseppe F. Italiano, Jerzy Nawrocki, and Harald Sack, editors, SOFSEM 
2013: Theory and Practice of Computer Science, volume 7741 of Lecture Notes 
in Computer Science (LNCS), pages 220-232. Springer, 2013. 

[31] Matthew Gwynne and Oliver Kullmann. Generalising unit-refutation
completeness and SLUR via nested input resolution. Technical Report
arXiv:1204.6529v5 [cs.LO], arXiv, January 2013.

[32] Matthew Gwynne and Oliver Kullmann. Generalising unit-refutation complete- 
ness and SLUR via nested input resolution. Journal of Automated Reasoning, 
2013. To appear. 

[33] Cheng-Shen Han and Jie-Hong Roland Jiang. When boolean satisfiability meets
Gaussian elimination in a Simplex way. In Parthasarathy Madhusudan and
Sanjit A. Seshia, editors, Computer Aided Verification, volume 7358 of Lecture
Notes in Computer Science, pages 410-426. Springer, July 2012.

[34] Marijn J. H. Heule and Hans van Maaren. Look-ahead based SAT solvers. In
Biere et al. [10], chapter 5, pages 155-184.



[35] Paul Jackson and Daniel Sheridan. Clause form conversions for boolean cir- 
cuits. In Holger H. Hoos and David G. Mitchell, editors, Theory and Applica- 
tions of Satisfiability Testing 2004, volume 3542 of Lecture Notes in Computer 
Science, pages 183-198, Berlin, 2005. Springer. ISBN 3-540-27829-X. 

[36] Matti Järvisalo and Tommi Junttila. Limitations of restricted branching in
clause learning. Constraints, 14(3):325-356, 2009.

[37] Matti Järvisalo, Arie Matsliah, Jakob Nordström, and Stanislav Živný.
Relating proof complexity measures and practical hardness of SAT. In Michela
Milano, editor, Principles and Practice of Constraint Programming - CP 2012,
volume 7514 of Lecture Notes in Computer Science, pages 316-331. Springer,
2012.

[38] Matti Järvisalo and Ilkka Niemelä. The effect of structural branching on the
efficiency of clause learning SAT solving: An experimental study. Journal of
Algorithms, 63(1):90-113, July 2008.

[39] Philipp Jovanovic and Martin Kreuzer. Algebraic attacks using SAT-solvers.
Groups-Complexity-Cryptology, 2(2):247-259, December 2010.

[40] Hans Kleine Büning. On generalized Horn formulas and k-resolution.
Theoretical Computer Science, 116:405-413, 1993.

[41] Hans Kleine Büning and Oliver Kullmann. Minimal unsatisfiability and
autarkies. In Biere et al. [10], chapter 11, pages 339-401.

[42] Oliver Kullmann. Investigating a general hierarchy of polynomially decidable
classes of CNF's based on short tree-like resolution proofs. Technical Report
TR99-041, Electronic Colloquium on Computational Complexity (ECCC),
October 1999.

[43] Oliver Kullmann. An application of matroid theory to the SAT problem. In
Fifteenth Annual IEEE Conference on Computational Complexity (2000), pages
116-124. IEEE Computer Society, July 2000.

[44] Oliver Kullmann. The combinatorics of conflicts between clauses. In Enrico
Giunchiglia and Armando Tacchella, editors, Theory and Applications of
Satisfiability Testing 2003, volume 2919 of Lecture Notes in Computer Science,
pages 426-440, Berlin, 2004. Springer. ISBN 3-540-20851-8.

[45] Oliver Kullmann. Upper and lower bounds on the complexity of generalised
resolution and generalised constraint satisfaction problems. Annals of
Mathematics and Artificial Intelligence, 40(3-4):303-352, March 2004.

[46] Oliver Kullmann, editor. Theory and Applications of Satisfiability Testing -
SAT 2009, volume 5584 of Lecture Notes in Computer Science. Springer, 2009.
ISBN 978-3-642-02776-5.

[47] Oliver Kullmann. Constraint satisfaction problems in clausal form II: Minimal
unsatisfiability and conflict structure. Fundamenta Informaticae, 109(1):83-119,
2011.

[48] Oliver Kullmann and Xishun Zhao. On Davis-Putnam reductions for minimally
unsatisfiable clause-sets. Technical Report arXiv:1202.2600v5 [cs.DM], arXiv,
December 2012.

[49] Tero Laitinen, Tommi Junttila, and Ilkka Niemelä. Conflict-driven XOR-clause
learning. In Alessandro Cimatti and Roberto Sebastiani, editors, Theory and
Applications of Satisfiability Testing - SAT 2012, volume 7317 of Lecture
Notes in Computer Science, pages 383-396. Springer, 2012. ISBN-13
978-3-642-31611-1.

[50] Joao Marques-Silva. Computing minimally unsatisfiable subformulas: State 
of the art and future directions. Journal of Multiple- Valued Logic and Soft 
Computing, 19(1-3):163-183, 2012. 

[51] David A. Plaisted and Steven Greenbaum. A structure-preserving clause form 
translation. Journal of Symbolic Computation, 2(3):293-304, 1986. 

[52] Olivier Roussel and Vasco Manquinho. Pseudo-boolean and cardinality
constraints. In Biere et al. [10], chapter 22, pages 695-733.

[53] Carsten Sinz. Towards an optimal CNF encoding of boolean cardinality con- 
straints. In Principles and Practice of Constraint Programming - CP 2005, 
volume 3709 of Lecture Notes in Computer Science (LNCS), pages 827-831. 
Springer, 2005. 

[54] Robert H. Sloan, Balázs Szörényi, and György Turán. On k-term DNF with the
largest number of prime implicants. SIAM Journal on Discrete Mathematics,
21(4):987-998, 2007.

[55] Mate Soos. Enhanced Gaussian elimination in DPLL-based SAT solvers.
Pragmatics of SAT, 2010.



[56] Mate Soos, Karsten Nohl, and Claude Castelluccia. Extending SAT solvers to
cryptographic problems. In Kullmann [46], pages 244-257.
http://planete.inrialpes.fr/~soos/publications/Extending_SAT_2009.pdf

[57] Emanuel Sperner. Ein Satz über Untermengen einer endlichen Menge.
Mathematische Zeitschrift, 27(1):544-548, 1928.

[58] Naoyuki Tamura, Akiko Taga, Satoshi Kitagawa, and Mutsunori Banbara.
Compiling finite linear CSP into SAT. Constraints, 14(2):254-272, 2009.

[59] Tomoya Tanjo, Naoyuki Tamura, and Mutsunori Banbara. A compact and
efficient SAT-encoding of finite domain CSP. In Karem A. Sakallah and Laurent
Simon, editors, Theory and Applications of Satisfiability Testing - SAT 2011,
volume 6695 of Lecture Notes in Computer Science, pages 375-376.
Springer, 2011. ISBN-13 978-3-642-14185-0.

[60] Alasdair Urquhart. The complexity of propositional proofs. The Bulletin of
Symbolic Logic, 1(4):425-467, 1995.

[61] Hans van Maaren. A short note on some tractable cases of the satisfiability 
problem. Information and Computation, 158(2):125-130, May 2000. 

[62] Václav Vlček. Classes of boolean formulae with effectively solvable SAT. In
Jana Safrankova and Jiri Pavlu, editors, Proceedings of the 19th Annual
Conference of Doctoral Students - WDS 2010, volume 1, pages 42-47. Matfyzpress,
2010.






